feat: Add Advanced Network Media operators and applications for SMPTE 2110 compliant streaming#970
Conversation
|
@grlee77 if/when you have any BW, this link would help reviewing just the proper commits, as this is based on #813 : ronyrad/holohub@ano_rivermax_update_mar_25/adv_network_rivermax_manager_update_pr_candidate_01...ronyrad:holohub:ano_rivermax_update_mar_25/adv_network_media_operators_dev_01 |
|
Hi @ronyrad, could you please resolve the conflicts for this PR and update it with the latest changes on |
69f2586 to
ee4c2f0
Compare
There was a problem hiding this comment.
Greptile Overview
Greptile Summary
This PR adds comprehensive media streaming capabilities to Holoscan Holohub with two new operators (AdvNetworkMediaRxOp and AdvNetworkMediaTxOp) and sample applications demonstrating SMPTE 2110-compliant streaming over Rivermax.
Key Changes (Last 3 Commits):
-
Adaptive Burst Pool Management (commit 9b8c5e9): Implements intelligent burst dropping system with configurable thresholds (25% warning, 10% critical, 50% recovery) to prevent memory pool exhaustion during high-throughput operations. The system enters critical dropping mode when pool capacity falls below 10%, drops new bursts to preserve system stability, and exits dropping mode when capacity recovers to 50%.
-
Lint Fixes (commit 7b68d26): Code style and formatting improvements
-
Dockerfile Update (commit ee4c2f0): Build configuration updates
New Operators:
AdvNetworkMediaRxOp: Receives SMPTE 2110 streams, converts packets to frames, outputs as VideoBuffer or Tensor entitiesAdvNetworkMediaTxOp: Transmits VideoBuffer/Tensor frames over Rivermax with congestion detection
Sample Applications:
adv_networking_media_player: Demonstrates reception with visualization (HolovizOp) or file output (FramesWriter)adv_networking_media_sender: Demonstrates transmission from VideoStreamReplayer
Architecture Quality:
The implementation follows solid software engineering practices with clear separation of concerns (frame assembly, burst processing, memory management), comprehensive error handling, and extensive logging for diagnostics.
Confidence Score: 5/5
- This PR is safe to merge with well-designed architecture and comprehensive implementation
- The implementation demonstrates excellent software engineering practices with proper memory management, thread safety via mutexes and atomics, configurable adaptive burst dropping to prevent exhaustion, comprehensive error handling throughout the pipeline, clear separation of concerns with dedicated classes for frame assembly and burst processing, and extensive diagnostic logging. The adaptive burst pool management addresses a critical production concern (memory exhaustion under high load) with a well-thought-out state machine approach.
- No files require special attention - all implementations follow best practices
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp | 5/5 | Implements adaptive burst pool management with configurable thresholds (25%/10%/50%) for warning/critical/recovery levels to prevent memory exhaustion |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.h | 5/5 | Adds pool capacity monitoring methods, adaptive dropping configuration, and statistics tracking for burst management |
| operators/advanced_network_media/advanced_network_media_rx/adv_network_media_rx.cpp | 5/5 | New media RX operator handling SMPTE 2110 packet-to-frame assembly with support for VideoBuffer/Tensor outputs and configurable memory locations |
| operators/advanced_network_media/advanced_network_media_tx/adv_network_media_tx.cpp | 5/5 | New media TX operator for transmitting VideoBuffer/Tensor frames over Rivermax with congestion detection and frame dropping |
| applications/adv_networking_media_player/cpp/adv_networking_media_player.cpp | 5/5 | Sample application demonstrating media reception pipeline with AdvNetworkMediaRxOp → HolovizOp/FramesWriter for visualization or file output |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_config_manager.cpp | 5/5 | Adds YAML parsing for burst_pool_adaptive_dropping configuration with validation and default values |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_mgr_service.cpp | 5/5 | Applies burst pool configuration from YAML to RX services (IPO and RTP receivers) during service initialization |
Sequence Diagram
sequenceDiagram
participant App as Application
participant TxOp as AdvNetworkMediaTxOp
participant TxMgr as Rivermax TX Manager
participant Network as Network Interface
participant RxMgr as Rivermax RX Manager
participant BurstMgr as RxBurstsManager
participant RxOp as AdvNetworkMediaRxOp
participant Assembler as MediaFrameAssembler
participant Viz as HolovizOp/FramesWriter
Note over App,TxOp: Sender Pipeline
App->>TxOp: VideoBuffer/Tensor frame
TxOp->>TxOp: validate_frame_parameters()
TxOp->>TxMgr: create_tx_burst_params()
TxMgr-->>TxOp: burst params
TxOp->>TxOp: is_tx_burst_available()
alt Burst Available
TxOp->>TxMgr: get_tx_packet_burst()
TxOp->>TxMgr: send_tx_burst()
TxMgr->>Network: transmit packets
else Burst Not Available
TxOp->>TxOp: sleep & retry (congestion)
end
Note over Network,RxMgr: Network Layer
Network->>RxMgr: receive packets burst
RxMgr->>BurstMgr: submit_next_packet()
alt Adaptive Dropping Check
BurstMgr->>BurstMgr: should_drop_burst_due_to_capacity()
BurstMgr->>BurstMgr: get_pool_utilization_percent()
alt Pool < 10% Critical
BurstMgr->>BurstMgr: enter critical dropping mode
BurstMgr->>BurstMgr: drop burst (return to pool)
else Pool >= 50% Recovery
BurstMgr->>BurstMgr: exit critical dropping mode
BurstMgr->>RxOp: enqueue burst to output queue
end
end
Note over RxOp,Viz: Receiver Pipeline
RxOp->>BurstMgr: get_rx_burst()
BurstMgr-->>RxOp: burst params
RxOp->>Assembler: process_burst()
Assembler->>Assembler: extract RTP packets
Assembler->>Assembler: assemble frame from packets
alt Frame Complete
Assembler->>RxOp: on_frame_completed()
RxOp->>RxOp: add to ready_frames queue
RxOp->>BurstMgr: free_all_packets_and_burst_rx()
alt Multiple Frames Ready
RxOp->>RxOp: drop older frames (keep latest)
end
RxOp->>Viz: emit VideoBuffer/Tensor entity
Viz->>Viz: visualize or write to file
end
62 files reviewed, no comments
ee4c2f0 to
b9582c5
Compare
There was a problem hiding this comment.
Greptile Overview
Greptile Summary
This PR focuses on the last 3 commits which implement adaptive burst pool management for the Rivermax advanced networking infrastructure:
Key Changes
Commit 4b974e3 - Adaptive burst pool management implementation:
- Added configurable threshold-based burst dropping to prevent memory pool exhaustion
- Three-tier threshold system: low capacity warning (25%), critical capacity dropping (10%), and recovery (50%)
- Statistics tracking for dropped bursts, capacity warnings, and critical events
- Hysteresis-based state machine to prevent oscillation between dropping/accepting modes
Commit f2c3564 - Lint fixes:
- Reformatted single-line conditionals to multi-line blocks for consistency
- Fixed line wrapping and indentation issues
Commit b9582c5 - Dockerfile updates:
- Added Python Holoscan package installation for test/development workflows
- Added visualization dependencies (libvulkan1, xvfb, x11vnc) for headless rendering
Issues Found
- Logic error in burst_manager.cpp:490-510 and 514-532: Drop statistics counters are incremented before
should_drop_burst_adaptive()determines whether to actually drop the burst, leading to inflated statistics when bursts are not dropped
Confidence Score: 4/5
- Safe to merge with the statistics counter bug fixed
- The adaptive burst pool management feature is well-designed with proper configuration plumbing through all layers. However, there's a logic error in statistics tracking where counters are incremented before the drop decision, causing inaccurate metrics. This won't affect functionality but will mislead monitoring. The lint fixes and Dockerfile updates are safe.
- operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp requires fixing the statistics counter logic
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.h | 5/5 | Added adaptive burst pool management infrastructure with configurable thresholds - well-designed API |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp | 3/5 | Implements adaptive burst dropping logic; statistics counters are incremented before drop decision is made, leading to inaccurate statistics |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_config_manager.cpp | 5/5 | Added parsing for burst pool adaptive dropping configuration from YAML with sensible defaults - implementation looks correct |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_mgr_service.cpp | 5/5 | Implements burst pool configuration application to burst managers in both IPO and RTP receiver services - correct implementation |
Sequence Diagram
sequenceDiagram
participant Config as YAML Config
participant Parser as RivermaxConfigParser
participant Builder as RXQueueBuilder
participant Service as RivermaxManagerRxService
participant Manager as RxBurstsManager
participant Pool as AnoBurstsMemoryPool
Note over Config,Manager: Configuration Phase
Config->>Parser: Parse burst_pool_adaptive_dropping settings
Parser->>Builder: Store thresholds in builder
Builder->>Service: Create service with config
Service->>Service: apply_burst_pool_configuration()
Service->>Manager: configure_pool_thresholds()
Service->>Manager: set_adaptive_burst_dropping(enabled)
Note over Manager,Pool: Runtime Burst Processing
Manager->>Manager: submit_next_packet()
Manager->>Manager: enqueue_and_reset_current_burst()
Manager->>Manager: should_drop_burst_due_to_capacity()
Manager->>Pool: get_pool_utilization_percent()
Pool-->>Manager: utilization %
alt utilization < critical_threshold
Manager->>Manager: should_drop_burst_adaptive()
Manager->>Manager: Enter critical dropping mode
Manager->>Manager: Drop burst (return to pool)
Manager->>Manager: Increment drop counters
else utilization < low_threshold
Manager->>Manager: should_drop_burst_adaptive()
Manager->>Manager: Check dropping policy
alt Should drop based on policy
Manager->>Manager: Drop burst
else Continue normal operation
Manager->>Manager: Enqueue burst to output queue
end
else utilization >= recovery_threshold
Manager->>Manager: Exit critical dropping mode
Manager->>Manager: Enqueue burst to output queue
end
62 files reviewed, 2 comments
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds two Holohub applications (media player and sender), a new advanced_network_media operator suite (common, RX, TX) with C++ and Python bindings, extensive frame-assembly and memory-copy logic, Rivermax burst-pool adaptive-dropping support, configuration/metadata/docs, and CMake build updates across applications and operators. Changes
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
|
b9582c5 to
d9adc8f
Compare
There was a problem hiding this comment.
Greptile Overview
Greptile Summary
This PR implements adaptive burst pool management with configurable capacity thresholds for the Rivermax advanced network manager. The changes add monitoring and adaptive dropping capabilities to prevent memory pool exhaustion during high network load.
Key Changes
- Adaptive Burst Dropping: Implements a
CRITICAL_THRESHOLDpolicy that drops incoming bursts when memory pool capacity falls below critical levels and stops dropping when capacity recovers - Configurable Thresholds: Adds YAML configuration for
low_threshold_percent(25%),critical_threshold_percent(10%), andrecovery_threshold_percent(50%) - Statistics Tracking: Comprehensive monitoring with atomic counters for dropped bursts, capacity warnings, and critical events
- Configuration Propagation: Wires configuration from YAML → Parser → Builder → Service → BurstManager
- Dockerfile Updates: Adds Holoscan pip package and visualization libraries (Vulkan, Xvfb, x11vnc) for headless rendering
Issues Found
The implementation has a critical logic error in burst_manager.cpp:490-510 where drop statistics counters are incremented BEFORE the actual drop decision is made. This inflates the drop counters even when bursts are not actually dropped, making statistics unreliable for monitoring and debugging.
Confidence Score: 2/5
- This PR should NOT be merged until the counter logic bug is fixed - incorrect statistics will make production debugging extremely difficult
- Score reflects critical logic error in burst dropping statistics that inflates drop counters incorrectly. While the overall architecture and configuration plumbing are sound, the bug in the core adaptive dropping logic will make the feature unreliable in production. The statistics are meant to inform operators about system behavior under load, but they will be misleading.
- operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp requires immediate attention - the counter increment logic must be fixed before merge
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp | 2/5 | Implements adaptive burst pool management with configurable thresholds. Contains logic error where drop counters are incremented before actual drop decision, inflating statistics incorrectly. |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.h | 4/5 | Adds burst pool monitoring infrastructure with thresholds, statistics tracking, and adaptive dropping policies. Clean interface design with proper documentation. |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_config_manager.cpp | 5/5 | Adds YAML parsing for burst pool adaptive dropping configuration with proper defaults and logging. Mostly formatting/lint fixes otherwise. |
| operators/advanced_network/Dockerfile | 5/5 | Adds Holoscan pip package, fixes permissions for dist-info, and adds visualization libraries (Vulkan, Xvfb, x11vnc) for headless rendering support. |
Sequence Diagram
sequenceDiagram
participant Config as YAML Config
participant Parser as RivermaxConfigParser
participant Builder as QueueConfigBuilder
participant Service as RxService
participant Manager as RxBurstsManager
participant Pool as AnoBurstsMemoryPool
Config->>Parser: Parse burst_pool_adaptive_dropping config
Parser->>Builder: Set thresholds (low/critical/recovery %)
Builder->>Service: Copy configuration to service
Service->>Manager: apply_burst_pool_configuration()
Manager->>Manager: configure_pool_thresholds()
Manager->>Manager: set_adaptive_burst_dropping(enabled)
Note over Manager,Pool: Runtime Burst Processing
Manager->>Manager: should_drop_burst_due_to_capacity()
Manager->>Pool: get_pool_utilization_percent()
Pool-->>Manager: utilization %
alt Utilization < critical_threshold
Manager->>Manager: Increment counters (bursts_dropped_critical_capacity)
Manager->>Manager: should_drop_burst_adaptive(utilization)
alt in_critical_dropping_mode OR utilization < critical
Manager->>Manager: Enter/stay in critical mode
Manager-->>Service: return true (drop burst)
else utilization >= recovery_threshold
Manager->>Manager: Exit critical mode
Manager-->>Service: return false (keep burst)
end
else Utilization < low_threshold
Manager->>Manager: Increment counters (bursts_dropped_low_capacity)
Manager->>Manager: should_drop_burst_adaptive(utilization)
Manager-->>Service: return drop decision
else Utilization >= low_threshold
Manager-->>Service: return false (keep burst)
end
62 files reviewed, 2 comments
d6a01c2 to
9e9a0fd
Compare
|
Caution Review failedFailed to post review comments WalkthroughThis PR introduces two new advanced networking media applications (player and sender) with corresponding RX/TX operators. It adds comprehensive frame assembly, memory copy strategy detection, RTP packet processing, and adaptive burst pool management for high-performance network media streaming. The PR enhances the existing Rivermax manager with burst pool capacity monitoring and adaptive dropping policies. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Areas requiring extra attention:
Possibly related PRs
Suggested reviewers
Pre-merge checks✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Greptile Overview
Greptile Summary
This PR introduces advanced network media operators and sample applications for SMPTE 2110 compliant streaming using Rivermax. The implementation provides a clean abstraction for media streaming over high-performance network infrastructure.
Key Components Added:
AdvNetworkMediaRxOp: Receives media frames from network, handles burst-to-frame conversion with adaptive memory copy strategiesAdvNetworkMediaTxOp: Transmits VideoBuffer/Tensor frames over network with validation and error handling- Sample applications demonstrating both sender and receiver pipelines with visualization and file output options
- Python bindings for both operators enabling dual-language support
Architecture Strengths:
- Well-structured frame assembly logic with state machine pattern for RX path
- Comprehensive error handling and validation throughout
- Flexible output formats (VideoBuffer/Tensor) and memory locations (host/device)
- Clean separation of concerns between operators, frame buffers, and network management
Note on Last 3 Commits:
The last 3 commits focus on burst pool management enhancements in the advanced_network module. Previous comments have identified counter increment issues in burst_manager.cpp:490 and burst_manager.cpp:514 where statistics are incremented before the actual drop decision is made.
Confidence Score: 4/5
- This PR is safe to merge with addressing the burst manager counter logic issues already identified in previous comments
- Score of 4 reflects well-structured operators and applications with comprehensive error handling. The deduction is due to the pre-existing counter increment logic issues in burst_manager.cpp (lines 490, 514) where statistics are incremented before drop decisions are finalized, which was already flagged in previous comments. The new media operators themselves are well-implemented.
- operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp - Address counter increment timing issues at lines 490 and 514
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| operators/advanced_network_media/advanced_network_media_rx/adv_network_media_rx.cpp | 4/5 | RX operator implementation with frame assembly logic, burst processing, and statistics tracking. Generally well-structured with comprehensive error handling. |
| operators/advanced_network_media/advanced_network_media_tx/adv_network_media_tx.cpp | 5/5 | TX operator implementation handling VideoBuffer and Tensor inputs with frame validation. Clean implementation with proper error handling. |
| applications/adv_networking_media_player/cpp/adv_networking_media_player.cpp | 5/5 | Media player application demonstrating RX operator usage with visualization and file output options. Well-structured with proper initialization and cleanup. |
| applications/adv_networking_media_sender/cpp/adv_networking_media_sender.cpp | 5/5 | Media sender application demonstrating TX operator usage with video stream replay. Clean and straightforward implementation. |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp | 3/5 | Burst pool management with adaptive burst dropping. Contains logic issues with counter increments happening before drop decisions (see existing comments). |
Sequence Diagram
sequenceDiagram
participant App as Media Sender App
participant Replayer as VideoStreamReplayer
participant TxOp as AdvNetworkMediaTxOp
participant ANM as Advanced Network Manager
participant RDK as Rivermax SDK
participant NIC as Network Interface
App->>Replayer: Start playback
Replayer->>TxOp: VideoBuffer/Tensor Entity
TxOp->>TxOp: Validate frame parameters
TxOp->>TxOp: Create MediaFrame wrapper
TxOp->>ANM: create_tx_burst_params()
TxOp->>ANM: is_tx_burst_available()
alt Burst Available
TxOp->>ANM: get_tx_packet_burst()
TxOp->>ANM: Attach MediaFrame via custom_pkt_data
TxOp->>ANM: send_tx_burst()
ANM->>RDK: MediaSender service processing
RDK->>RDK: RTP packetization (SMPTE 2110)
RDK->>NIC: DMA transfer to hardware queues
NIC->>NIC: Network transmission
else Burst Not Available
TxOp->>TxOp: Sleep and retry
end
participant Net as Network
participant RxNIC as RX Network Interface
participant RxRDK as RX Rivermax SDK
participant RxANM as RX Advanced Network Manager
participant RxOp as AdvNetworkMediaRxOp
participant Assembler as MediaFrameAssembler
participant PlayerApp as Media Player App
participant Holoviz as HolovizOp
Net->>RxNIC: Incoming packets
RxNIC->>RxRDK: DMA to memory regions
RxRDK->>RxANM: Burst assembly (packet pointers)
RxANM->>RxOp: get_rx_burst()
RxOp->>Assembler: process_burst()
Assembler->>Assembler: State machine processing
Assembler->>Assembler: Memory copy strategy selection
alt Frame Complete
Assembler->>RxOp: on_frame_completed()
RxOp->>RxOp: Add to ready_frames queue
RxOp->>RxANM: free_all_packets_and_burst_rx()
RxOp->>PlayerApp: Emit VideoBuffer/Tensor Entity
alt Visualization Mode
PlayerApp->>Holoviz: Display frame
else File Output Mode
PlayerApp->>PlayerApp: Write to file
end
end
62 files reviewed, no comments
9e9a0fd to
41d332b
Compare
There was a problem hiding this comment.
Greptile Overview
Greptile Summary
This PR's last 3 commits focus on implementing adaptive burst pool management for the Rivermax network manager, along with Dockerfile updates and code formatting improvements.
Key Changes:
- Adaptive Burst Dropping: Introduces configurable threshold-based burst dropping to handle memory pool pressure with three levels (critical, low, recovery)
- Pool Capacity Monitoring: Adds comprehensive statistics tracking and periodic logging for burst pool utilization
- Dockerfile Enhancements: Installs Holoscan Python package and visualization dependencies (libvulkan1, xvfb, x11vnc) to support the new media player applications
- Code Quality: Applies consistent formatting and linting fixes across Rivermax implementation files
Critical Issue Found:
The should_drop_burst_due_to_capacity() function in burst_manager.cpp:490-533 has a logic error where drop counters are incremented before the actual drop decision is made by should_drop_burst_adaptive(). When the system is in the recovery range (utilization between critical and recovery thresholds), should_drop_burst_adaptive() returns false but the counters have already been incremented, leading to inflated drop statistics.
Confidence Score: 2/5
- This PR contains a critical logic bug in burst dropping statistics that will cause incorrect monitoring and metrics
- The adaptive burst pool management implementation has a fundamental flaw where counters are incremented before the actual drop decision. This affects both critical and low capacity paths. While the feature itself is well-structured with good logging and configuration, the incorrect statistics will mislead operators about system behavior and make debugging difficult. The bug needs to be fixed before merge.
- Pay close attention to
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpplines 490-533 where the counter increment logic needs to be moved after the drop decision
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| operators/advanced_network/Dockerfile | 5/5 | Added Holoscan Python package installation and visualization dependencies (libvulkan1, xvfb, x11vnc) for media player application support |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp | 2/5 | Implements adaptive burst pool management with configurable thresholds. Contains critical logic error where drop counters are incremented before determining if burst should actually be dropped. |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.h | 4/5 | Adds new member variables and methods for adaptive burst dropping with pool capacity monitoring |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_mgr_service.cpp | 4/5 | Integrates adaptive burst dropping into the receive flow with capacity checks |
Sequence Diagram
sequenceDiagram
participant App as Application
participant RxMgr as RxBurstsManager
participant Pool as AnoBurstsMemoryPool
participant Assembler as MediaFrameAssembler
participant Op as AdvNetworkMediaRxOp
participant Holoviz as HolovizOp/FramesWriter
Note over RxMgr,Pool: Initialization Phase
App->>RxMgr: Initialize with pool size
RxMgr->>Pool: Create burst pool (DEFAULT_NUM_RX_BURSTS)
RxMgr->>RxMgr: Initialize adaptive dropping thresholds
Note over RxMgr,Assembler: Reception & Processing Loop
RxMgr->>RxMgr: Check pool utilization
alt Utilization < Critical Threshold
RxMgr->>RxMgr: Increment drop counters
RxMgr->>RxMgr: should_drop_burst_adaptive()
alt Should Drop
RxMgr-->>RxMgr: Drop burst
else Should Not Drop (Recovery Range)
Note right of RxMgr: BUG: Counters already incremented!
RxMgr->>Pool: Dequeue burst from pool
end
else Utilization < Low Threshold
RxMgr->>RxMgr: Increment drop counters
RxMgr->>RxMgr: should_drop_burst_adaptive()
alt Should Drop
RxMgr-->>RxMgr: Drop burst
else Should Not Drop
Note right of RxMgr: BUG: Counters already incremented!
RxMgr->>Pool: Dequeue burst from pool
end
else Healthy Capacity
RxMgr->>Pool: Dequeue burst from pool
end
Pool-->>RxMgr: Return burst
RxMgr->>Assembler: Process network burst
Assembler->>Assembler: Convert packets to frame data
Assembler->>Op: on_frame_completed callback
Op->>Op: Emit VideoBuffer or Tensor
Op->>Holoviz: Send frame for visualization/writing
Holoviz->>Holoviz: Display or write to file
Note over RxMgr,Pool: Cleanup Phase
Op->>RxMgr: Return processed burst
RxMgr->>Pool: Enqueue burst back to pool
62 files reviewed, no comments
41d332b to
599f8ec
Compare
There was a problem hiding this comment.
Greptile Overview
Greptile Summary
Implements adaptive burst pool management for Rivermax network operations with configurable capacity thresholds. The feature monitors memory pool utilization and drops bursts when capacity falls below critical levels to prevent resource exhaustion.
Key changes:
- Increased default burst pool size from 64 to 256 bursts for better throughput
- Added three-tier threshold system: low (25%), critical (10%), and recovery (50%)
- Implemented
CRITICAL_THRESHOLDdrop policy that enters dropping mode at critical capacity and exits at recovery threshold - Added comprehensive statistics tracking for dropped bursts and capacity events
- Configured through YAML with
burst_pool_adaptive_dropping_enabledand threshold parameters
Issue found:
burst_manager.cpp:490-510and514-532: Drop statistics counters are incremented beforeshould_drop_burst_adaptive()determines if burst should actually be dropped, leading to inflated drop statistics when the system is in the low-capacity-but-recovering range
Confidence Score: 3/5
- Safe to merge with one logic error in statistics tracking that inflates drop counters
- The adaptive burst dropping feature is well-designed with proper configuration, monitoring, and logging. However, the statistics tracking bug (incrementing counters before actual drop decision) will report inaccurate metrics. This doesn't affect functional correctness but undermines observability and debugging capabilities for production deployments.
- Pay close attention to
burst_manager.cpp:490-510and514-532where drop statistics are incremented prematurely
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp | 2/5 | Implements adaptive burst pool management with configurable thresholds. Contains logic error where drop statistics are incremented before actual drop decision is made. |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.h | 4/5 | Adds pool monitoring infrastructure, configurable thresholds (low=25%, critical=10%, recovery=50%), and statistics tracking. Increases default pool size from 64 to 256 bursts. |
| operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_queue_configs.h | 5/5 | Adds burst pool adaptive dropping configuration fields to RX queue configs with default thresholds matching burst_manager. |
Sequence Diagram
sequenceDiagram
participant App as Application
participant RMgr as RxBurstsManager
participant Pool as AnoBurstsMemoryPool
participant Queue as AnoBurstsQueue
participant Adaptive as Adaptive Drop Logic
Note over RMgr: Initialize with pool_size=256<br/>critical=10%, low=25%, recovery=50%
App->>RMgr: submit_next_packet(packet_data)
RMgr->>RMgr: get_or_allocate_current_burst()
RMgr->>Pool: dequeue_burst()
alt Pool has available bursts
Pool-->>RMgr: return burst
RMgr->>RMgr: append_packet()
alt Burst full
RMgr->>RMgr: enqueue_and_reset_current_burst()
RMgr->>Adaptive: should_drop_burst_due_to_capacity()
Adaptive->>Adaptive: get_pool_utilization_percent()
alt utilization < critical (10%)
Note over Adaptive: INCREMENT COUNTERS<br/>(BEFORE decision!)
Adaptive->>Adaptive: should_drop_burst_adaptive()
Adaptive-->>RMgr: true (drop burst)
RMgr->>Pool: enqueue_burst() [return to pool]
else utilization < low (25%)
Note over Adaptive: INCREMENT COUNTERS<br/>(BEFORE decision!)
Adaptive->>Adaptive: should_drop_burst_adaptive()
alt in_critical_dropping_mode == true
Adaptive-->>RMgr: true (drop burst)
RMgr->>Pool: enqueue_burst() [return to pool]
else in_critical_dropping_mode == false
Adaptive-->>RMgr: false (keep burst)
Note over Adaptive: BUG: Counters already<br/>incremented but no drop!
RMgr->>Queue: enqueue_burst() [to output]
end
else utilization >= low (25%)
Adaptive-->>RMgr: false (keep burst)
RMgr->>Queue: enqueue_burst() [to output]
end
end
else Pool exhausted
Pool-->>RMgr: nullptr
Note over RMgr: NO_FREE_BURST_BUFFERS
end
App->>RMgr: get_rx_burst()
RMgr->>Queue: dequeue_burst()
Queue-->>App: return burst
App->>RMgr: rx_burst_done(burst)
RMgr->>Pool: enqueue_burst() [return to pool]
62 files reviewed, 2 comments
599f8ec to
0de6c57
Compare
Greptile SummaryThis PR introduces
Confidence Score: 4/5Safe to merge after addressing the frame-pool exhaustion crash path; all other findings are P2. One P1 defect remains: pool exhaustion causes a pipeline crash instead of graceful degradation. The fix is localised (change throw to return nullptr in get_allocated_frame). All prior P1/P0 concerns from previous review threads are marked fixed. operators/advanced_network_media/advanced_network_media_rx/adv_network_media_rx.cpp (get_allocated_frame pool-exhaustion path) Important Files Changed
Sequence DiagramsequenceDiagram
participant Net as Network Interface
participant RxOp as AdvNetworkMediaRxOp
participant Assembler as MediaFrameAssembler
participant Pool as FramePool (250 slots)
participant Down as Downstream (HolovizOp / Writer)
Net->>RxOp: get_rx_burst()
RxOp->>Assembler: process_burst(burst)
Assembler->>Pool: get_new_frame() [throws if empty!]
Pool-->>Assembler: FrameBuffer
Assembler-->>RxOp: frame completed
RxOp->>RxOp: wrap in GXF Entity (release_func → Pool)
RxOp->>Down: emit(entity)
Down-->>Pool: release_func() returns frame
Note over RxOp,Down: TX path
participant Replayer as VideoStreamReplayerOp
participant TxOp as AdvNetworkMediaTxOp
Replayer->>TxOp: VideoBuffer/Tensor entity
TxOp->>TxOp: validate & create MediaFrame
TxOp->>Net: send_tx_burst()
Greploops — Automatically fix all review issues by running Reviews (15): Last reviewed commit: "Fix Rivermax build errors" | Re-trigger Greptile |
There was a problem hiding this comment.
Actionable comments posted: 25
♻️ Duplicate comments (2)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp (2)
490-510: Critical: Counters incremented before drop decision.The counters
bursts_dropped_critical_capacity_,total_bursts_dropped_, andpool_capacity_critical_events_are incremented on lines 491-493 before callingshould_drop_burst_adaptive()on line 510. Whenshould_drop_burst_adaptive()returnsfalse(e.g., during recovery whenutilization >= pool_recovery_threshold_percent_but still belowpool_critical_threshold_percent_), the burst is not actually dropped but the counters were already incremented. This inflates drop statistics.This is a duplicate of an issue flagged in previous reviews.
Move counter increments after confirming the drop:
// Critical capacity - definitely drop with video-aware logic if (utilization < pool_critical_threshold_percent_) { - bursts_dropped_critical_capacity_++; - total_bursts_dropped_++; - pool_capacity_critical_events_++; - auto now = std::chrono::steady_clock::now(); auto time_since_last = std::chrono::duration_cast<std::chrono::milliseconds>(now - last_capacity_critical_time_) .count(); if (time_since_last > 1000) { // Log every second HOLOSCAN_LOG_ERROR( "CRITICAL: Pool capacity at {}% - dropping new bursts only (port={}, queue={})", utilization, port_id_, queue_id_); last_capacity_critical_time_ = now; } // In critical mode, use adaptive dropping but be more aggressive - return should_drop_burst_adaptive(utilization); + bool should_drop = should_drop_burst_adaptive(utilization); + if (should_drop) { + bursts_dropped_critical_capacity_++; + total_bursts_dropped_++; + pool_capacity_critical_events_++; + } + return should_drop; }
514-533: Critical: Counters incremented before drop decision.Same issue as the critical capacity path - counters
bursts_dropped_low_capacity_,total_bursts_dropped_, andpool_capacity_warnings_are incremented on lines 515-517 beforeshould_drop_burst_adaptive()determines whether to actually drop. This inflates statistics when bursts are not dropped.This is a duplicate of an issue flagged in previous reviews.
Move counter increments after confirming the drop:
// Low capacity - use adaptive dropping policies if (utilization < pool_low_threshold_percent_) { - bursts_dropped_low_capacity_++; - total_bursts_dropped_++; - pool_capacity_warnings_++; - auto now = std::chrono::steady_clock::now(); auto time_since_last = std::chrono::duration_cast<std::chrono::milliseconds>(now - last_capacity_warning_time_) .count(); if (time_since_last > 5000) { // Log every 5 seconds HOLOSCAN_LOG_WARN("LOW: Pool capacity at {}% - adaptive burst dropping (port={}, queue={})", utilization, port_id_, queue_id_); last_capacity_warning_time_ = now; } - return should_drop_burst_adaptive(utilization); + bool should_drop = should_drop_burst_adaptive(utilization); + if (should_drop) { + bursts_dropped_low_capacity_++; + total_bursts_dropped_++; + pool_capacity_warnings_++; + } + return should_drop; }
🧹 Nitpick comments (6)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_config_manager.cpp (1)
564-588: Consider validating threshold relationships.The parsing logic correctly handles optional configuration with sensible defaults. However, there's no validation that the thresholds maintain logical relationships (e.g.,
recovery_threshold > critical_threshold,low_threshold > critical_threshold). Invalid threshold configurations could lead to unexpected adaptive dropping behavior at runtime.Consider adding validation after parsing:
rivermax_rx_config.burst_pool_recovery_threshold_percent = burst_pool_config["recovery_threshold_percent"].as<uint32_t>(50); + // Validate threshold relationships + if (rivermax_rx_config.burst_pool_critical_threshold_percent >= + rivermax_rx_config.burst_pool_low_threshold_percent) { + HOLOSCAN_LOG_WARN("Critical threshold ({}%) should be less than low threshold ({}%)", + rivermax_rx_config.burst_pool_critical_threshold_percent, + rivermax_rx_config.burst_pool_low_threshold_percent); + } + if (rivermax_rx_config.burst_pool_recovery_threshold_percent <= + rivermax_rx_config.burst_pool_critical_threshold_percent) { + HOLOSCAN_LOG_WARN("Recovery threshold ({}%) should be greater than critical threshold ({}%)", + rivermax_rx_config.burst_pool_recovery_threshold_percent, + rivermax_rx_config.burst_pool_critical_threshold_percent); + } HOLOSCAN_LOG_INFO(operators/advanced_network_media/common/adv_network_media_common.h (1)
27-39: Wrap macro in do-while(0) for safety.The macro should use the do-while(0) pattern to prevent issues with if-else chains and ensure it behaves like a single statement.
Apply this diff:
-#define CUDA_TRY(stmt) \ - { \ - cudaError_t cuda_status = stmt; \ - if (cudaSuccess != cuda_status) { \ - ANM_LOG_ERROR("Runtime call {} in line {} of file {} failed with '{}' ({})", \ - #stmt, \ - __LINE__, \ - __FILE__, \ - cudaGetErrorString(cuda_status), \ - static_cast<int>(cuda_status)); \ - throw std::runtime_error("CUDA operation failed"); \ - } \ - } +#define CUDA_TRY(stmt) \ + do { \ + cudaError_t cuda_status = stmt; \ + if (cudaSuccess != cuda_status) { \ + ANM_LOG_ERROR("Runtime call {} in line {} of file {} failed with '{}' ({})", \ + #stmt, \ + __LINE__, \ + __FILE__, \ + cudaGetErrorString(cuda_status), \ + static_cast<int>(cuda_status)); \ + throw std::runtime_error("CUDA operation failed"); \ + } \ + } while (0)applications/adv_networking_media_sender/cpp/adv_networking_media_sender.cpp (3)
17-17: Remove unused include.The
assert.hheader is included but never used in this file.Apply this diff:
-#include <assert.h> #include <sys/time.h>
43-46: Remove duplicate log statement.Lines 43 and 46 log essentially identical messages about the Advanced Network manager type.
Apply this diff to remove the duplicate:
HOLOSCAN_LOG_INFO("Using Advanced Network manager {}", advanced_network::manager_type_to_string(mgr_type)); - HOLOSCAN_LOG_INFO("Using ANO manager {}", advanced_network::manager_type_to_string(mgr_type)); if (!tx_en) {
33-50: Consider returning error codes instead of calling exit().Calling
exit(1)from withincompose()terminates the entire process abruptly. Consider propagating errors tomain()via exceptions or return codes for more graceful shutdown and better testability.operators/advanced_network_media/common/frame_buffer.h (1)
31-33: Removeusing namespacefrom the headerHeader-wide
using namespace …;pollutes every translation unit that includes this file and risks name clashes. Please switch to explicit aliases for the specific symbols you need.-using namespace rivermax::dev_kit::services; -using namespace holoscan::advanced_network; +using rivermax::dev_kit::services::IFrameBuffer; +using holoscan::advanced_network::MemoryLocation;
0de6c57 to
2e6b99f
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.h (1)
548-562: Remove the extratotal_bursts_dropped_++increment.
should_drop_burst_due_to_capacity()already updates all drop counters (includingtotal_bursts_dropped_). Incrementing again inenqueue_and_reset_current_burst()doubles every drop. Please drop the redundant increment so statistics stay accurate.if (should_drop_burst_due_to_capacity()) { rx_bursts_mempool_->enqueue_burst(cur_out_burst_); reset_current_burst(); - total_bursts_dropped_++; return Status::SUCCESS; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
operators/advanced_network/Dockerfile(2 hunks)operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/adv_network_rivermax_mgr.cpp(10 hunks)operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp(10 hunks)operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.h(10 hunks)operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_config_manager.cpp(10 hunks)operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_mgr_service.cpp(13 hunks)operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_mgr_service.h(3 hunks)operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_queue_configs.cpp(15 hunks)operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_queue_configs.h(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- operators/advanced_network/Dockerfile
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: nvidia-holoscan/holohub PR: 0
File: CONTRIBUTING.md:0-0
Timestamp: 2025-10-22T16:16:16.226Z
Learning: Applies to applications/CMakeLists.txt : Register applications in applications/CMakeLists.txt using add_holohub_application(...) and optional DEPENDS OPERATORS
Learnt from: CR
Repo: nvidia-holoscan/holohub PR: 0
File: CONTRIBUTING.md:0-0
Timestamp: 2025-10-22T16:16:16.226Z
Learning: Applies to workflow/CMakeLists.txt : Register workflows in workflow/CMakeLists.txt using add_holohub_application(...) and optional DEPENDS OPERATORS
Learnt from: CR
Repo: nvidia-holoscan/holohub PR: 0
File: CONTRIBUTING.md:0-0
Timestamp: 2025-10-22T16:16:16.226Z
Learning: Applies to operators/CMakeLists.txt : Register operators in operators/CMakeLists.txt using add_holohub_operator(...) and optional DEPENDS EXTENSIONS
Learnt from: CR
Repo: nvidia-holoscan/holohub PR: 0
File: CONTRIBUTING.md:0-0
Timestamp: 2025-10-22T16:16:16.226Z
Learning: Applies to pkg/CMakeLists.txt : Register packages in pkg/CMakeLists.txt using add_holohub_package(...)
🧬 Code graph analysis (7)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_mgr_service.cpp (3)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/adv_network_rivermax_mgr.cpp (24)
burst(99-99)burst(100-100)burst(101-101)burst(102-102)burst(103-103)burst(104-104)burst(105-105)burst(106-107)burst(108-109)burst(110-110)burst(111-111)burst(113-113)burst(114-114)burst(115-115)burst(116-116)burst(117-117)shutdown(561-595)shutdown(561-561)shutdown(800-802)shutdown(800-800)free_tx_burst(504-516)free_tx_burst(504-504)free_tx_burst(772-774)free_tx_burst(772-772)operators/advanced_network/advanced_network/common.cpp (4)
shutdown(280-283)shutdown(280-280)free_tx_burst(250-253)free_tx_burst(250-250)operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_mgr_service.h (1)
free_tx_burst(445-445)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_queue_configs.cpp (1)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_config_manager.cpp (2)
validate_cores(777-797)validate_cores(777-777)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_mgr_service.h (3)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.h (2)
shared_ptr(531-542)RxBurstsManager(317-522)operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp (2)
RxBurstsManager(392-428)RxBurstsManager(430-443)operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_mgr_service.cpp (2)
apply_burst_pool_configuration(74-90)apply_burst_pool_configuration(74-74)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp (1)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.h (1)
RxBurstsManager(317-522)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/adv_network_rivermax_mgr.cpp (1)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.h (2)
burst_port_id_from_burst_tag(69-71)burst_queue_id_from_burst_tag(79-81)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_config_manager.cpp (1)
operators/advanced_network/advanced_network/types.h (1)
GetMemoryKindFromString(106-118)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.h (1)
operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/burst_manager.cpp (10)
get_pool_status_string(445-464)get_pool_status_string(445-445)get_burst_drop_statistics(466-474)get_burst_drop_statistics(466-466)should_drop_burst_due_to_capacity(476-540)should_drop_burst_due_to_capacity(476-476)log_pool_capacity_status(542-556)log_pool_capacity_status(542-542)should_drop_burst_adaptive(558-595)should_drop_burst_adaptive(558-558)
There was a problem hiding this comment.
Actionable comments posted: 11
🤖 Fix all issues with AI agents
In
`@applications/adv_networking_media_player/cpp/adv_networking_media_player.cpp`:
- Around line 268-311: In compose(), replace the three exit(1) calls with thrown
std::runtime_error so destructors and framework cleanup can run: when
ano::adv_net_init(adv_net_config) fails (currently logs "Failed to configure the
Advanced Network manager"), throw std::runtime_error with that message and any
returned error details; when rx_en is false (currently logs "Rx is not
enabled..."), throw std::runtime_error with that message; and when neither
visualize nor write_to_file is enabled (currently logs "At least one output
type..."), throw std::runtime_error with that message—update compose(), and keep
the existing HOLOSCAN_LOG_* calls but remove exit(1) invocations so callers can
catch the exceptions.
In `@applications/adv_networking_media_sender/adv_networking_media_sender.yaml`:
- Around line 33-40: The comment above the rmm_allocator block is misleading
(mentions "8 MB" and a 1920×1080 RGBA image) while the actual allocations
(device_memory_initial_size, device_memory_max_size, host_memory_initial_size,
host_memory_max_size) are set to "1024 MB"; update or remove that comment so it
accurately reflects the configured sizes—either change the comment to state
"Initial size set to 1024 MB" (or explain why 1024 MB is chosen) or delete the
outdated line to avoid confusion when inspecting rmm_allocator and its
device/host memory size fields.
In
`@applications/adv_networking_media_sender/cpp/adv_networking_media_sender.cpp`:
- Line 17: Remove the unused C-style include by deleting the line that includes
<assert.h> from adv_networking_media_sender.cpp; if you later need assertions,
replace it with the C++ header <cassert> instead of <assert.h> so static
analysis no longer reports the unused include.
In
`@operators/advanced_network_media/advanced_network_media_rx/adv_network_media_rx.cpp`:
- Around line 703-711: The destructor of AdvNetworkMediaRxOp manually deletes
pimpl_ which is a std::unique_ptr, causing a double-delete; locate
AdvNetworkMediaRxOp::~AdvNetworkMediaRxOp() and remove the explicit delete of
pimpl_, leaving either an empty destructor (so unique_ptr auto-deletes) or, if
you need to call pimpl_->cleanup() first, call pimpl_->cleanup() then
pimpl_.reset() (do not use delete pimpl_).
In
`@operators/advanced_network_media/advanced_network_media_rx/python/adv_network_media_rx_pybind.cpp`:
- Around line 64-80: The binding's constructor PyAdvNetworkMediaRxOp leaves the
required parameter declared in setup() ("interface_name") unset when defaulting
to "" and only conditionally calling add_arg; fix by either making
interface_name required in the Python signature (remove the "" default from
PyAdvNetworkMediaRxOp) or always add the argument to the spec regardless of
emptiness (call this->add_arg(Arg("interface_name", interface_name))
unconditionally) so the OperatorSpec created in setup() receives the required
MetaParameter; update references in PyAdvNetworkMediaRxOp, setup(), and the
conditional add_arg logic accordingly.
In
`@operators/advanced_network_media/advanced_network_media_rx/python/adv_network_media_rx_pydoc.hpp`:
- Around line 60-66: The doc for the parameter output_format is missing the
allowed values; update the docstring for output_format in
adv_network_media_rx_pydoc.hpp to list both supported options ("video_buffer"
and "tensor") and retain the default indication (default: "video_buffer"), so
users know valid choices for output_format in the operator API.
In
`@operators/advanced_network_media/advanced_network_media_tx/adv_network_media_tx.cpp`:
- Around line 276-284: The destructor AdvNetworkMediaTxOp::~AdvNetworkMediaTxOp
currently manually deletes pimpl_, but pimpl_ is a
std::unique_ptr<AdvNetworkMediaTxOpImpl>, causing a double-delete; remove the
manual delete and nullptr assignment and either default the destructor or leave
it empty so the unique_ptr cleans up automatically; update the
AdvNetworkMediaTxOp::~AdvNetworkMediaTxOp implementation (or use = default) and
ensure pimpl_ is only managed via the unique_ptr member.
In `@operators/advanced_network_media/common/adv_network_media_common.h`:
- Around line 27-39: The CUDA_TRY macro is defined with bare braces which can
cause dangling-else / statement grouping issues; wrap the macro body in the
standard do { ... } while (0) idiom so it behaves like a single statement in all
contexts. Modify the CUDA_TRY definition (keeping its internals: cudaError_t
cuda_status = stmt; the ANM_LOG_ERROR call with cudaGetErrorString(cuda_status)
and the throw std::runtime_error) to be enclosed in do { /* existing code */ }
while (0) and leave the macro name and arguments unchanged.
In `@operators/advanced_network_media/common/frame_buffer.cpp`:
- Around line 57-91: In VideoBufferFrameBuffer::validate_format_compliance
replace the fallback branch that currently checks format_ !=
GXF_VIDEO_FORMAT_CUSTOM with an explicit check that format_ != expected_format
(so the code fails when the actual format doesn't equal expected_format),
keeping the NV12_709 and RGB special cases intact; likewise update
TensorFrameBuffer::validate_format_compliance to actually use the
expected_format parameter by validating format_ == expected_format before
performing tensor-shape checks, matching the equality-based pattern used in
AllocatedVideoBufferFrameBuffer and AllocatedTensorFrameBuffer.
In `@operators/advanced_network_media/common/rtp_params.h`:
- Around line 39-70: The parse() function currently dereferences rtp_hdr and
indexes using cc/offset without bounds checks; add a nullptr guard for rtp_hdr
and change the signature to accept a length (e.g., size_t rtp_len), validate the
base RTP header (minimum 12 bytes), compute cc and offset using
RTP_HEADER_CSRC_GRANULARITY_BYTES, then ensure rtp_len >= offset + 17 (to cover
reads up to rtp_hdr[offset + 16]) before accessing any bytes; if any check
fails, return false — update uses of sequence_number, f_bit, timestamp, m_bit,
and payload_size accordingly.
In
`@operators/advanced_network/advanced_network/managers/rivermax/rivermax_mgr_impl/rivermax_queue_configs.cpp`:
- Around line 299-323: In RivermaxCommonRxQueueValidator::validate, there's an
extra stray closing parenthesis immediately after the burst-pool thresholds
if-block that causes a compile error; remove the trailing ')' so the if-block
ends with its closing brace and the function compiles (ensure braces/parentheses
are balanced around the burst_pool_... validation and the surrounding return
statements).
🧹 Nitpick comments (4)
applications/adv_networking_media_sender/python/adv_networking_media_sender.py (1)
33-96: Consider usinglogging.exception()for error logs in exception handlers.When logging errors inside exception handlers,
logging.exception()automatically includes the traceback, which aids debugging. This applies to lines 88-89, 92-93, and similar patterns throughout the file.♻️ Example fix for one location
except Exception as e2: if require_rx: - logger.error("RX is not enabled. Please enable RX in the config file.") - logger.error(f"Could not find advanced_network_media_rx configuration: {e2}") + logger.exception("RX is not enabled. Please enable RX in the config file.") sys.exit(1) if require_tx: - logger.error("TX is not enabled. Please enable TX in the config file.") - logger.error(f"Could not find advanced_network_media_tx configuration: {e2}") + logger.exception("TX is not enabled. Please enable TX in the config file.") sys.exit(1)applications/adv_networking_media_player/python/adv_networking_media_player.py (1)
34-96: Consider extractingcheck_rx_tx_enabledto a shared module.This function is duplicated between
adv_networking_media_player.pyandadv_networking_media_sender.py(see relevant code snippets). Consider extracting it to a shared utility module to reduce code duplication and ensure consistent behavior.Additionally, within exception handlers (lines 88-94), consider using
logger.exception()instead oflogger.error()to automatically include stack traces, which aids debugging:except Exception as e2: if require_rx: - logger.error("RX is not enabled. Please enable RX in the config file.") - logger.error(f"Could not find advanced_network_media_rx configuration: {e2}") + logger.exception("RX is not enabled. Please enable RX in the config file.") sys.exit(1)operators/advanced_network_media/advanced_network_media_rx/adv_network_media_rx.h (1)
21-22: Include<memory>(and<cstdint>) explicitly forstd::unique_ptr/uint16_t.This avoids reliance on transitive includes and keeps the header self-sufficient.
♻️ Suggested include update
-#include <holoscan/holoscan.hpp> +#include <cstdint> +#include <memory> +#include <holoscan/holoscan.hpp>operators/advanced_network_media/advanced_network_media_rx/memory_copy_strategies.cpp (1)
176-203: Pointer arithmetic could overflow with buffer wraparound.At Line 180,
size_t pointer_diff = curr_ptr - prev_ptrassumescurr_ptr >= prev_ptr. If the buffer wraps around (detected later indetect_buffer_wraparound), this subtraction produces an unexpectedly large value due to unsigned underflow. The code handles this by detecting wraparound separately, but theis_exactly_contiguouscheck at Line 188 will fail silently with wrong values rather than explicitly handling the wraparound case.Consider using signed arithmetic or checking for wraparound before computing the diff:
Suggested improvement
for (size_t i = 1; i < collected_payloads_.size(); ++i) { uint8_t* prev_ptr = collected_payloads_[i - 1]; uint8_t* curr_ptr = collected_payloads_[i]; - size_t pointer_diff = curr_ptr - prev_ptr; + // Use signed arithmetic to detect backward jumps + ptrdiff_t signed_diff = curr_ptr - prev_ptr; + if (signed_diff <= 0) { + // Wraparound or backward pointer - not contiguous + is_exactly_contiguous = false; + is_stride_consistent = false; + break; + } + size_t pointer_diff = static_cast<size_t>(signed_diff);
| void compose() override { | ||
| using namespace holoscan; | ||
|
|
||
| auto adv_net_config = from_config("advanced_network").as<NetworkConfig>(); | ||
| if (ano::adv_net_init(adv_net_config) != Status::SUCCESS) { | ||
| HOLOSCAN_LOG_ERROR("Failed to configure the Advanced Network manager"); | ||
| exit(1); | ||
| } | ||
| HOLOSCAN_LOG_INFO("Configured the Advanced Network manager"); | ||
|
|
||
| const auto [rx_en, tx_en] = ano::get_rx_tx_configs_enabled(config()); | ||
| const auto mgr_type = ano::get_manager_type(config()); | ||
|
|
||
| HOLOSCAN_LOG_INFO("Using Advanced Network manager {}", | ||
| ano::manager_type_to_string(mgr_type)); | ||
| if (!rx_en) { | ||
| HOLOSCAN_LOG_ERROR("Rx is not enabled. Please enable Rx in the config file."); | ||
| exit(1); | ||
| } | ||
|
|
||
| auto adv_net_media_rx = | ||
| make_operator<ops::AdvNetworkMediaRxOp>("advanced_network_media_rx", | ||
| from_config("advanced_network_media_rx"), | ||
| make_condition<BooleanCondition>("is_alive", true)); | ||
|
|
||
| const auto allocator = make_resource<holoscan::UnboundedAllocator>("allocator"); | ||
|
|
||
| if (from_config("media_player_config.visualize").as<bool>()) { | ||
| const auto cuda_stream_pool = | ||
| make_resource<holoscan::CudaStreamPool>("cuda_stream", 0, 0, 0, 1, 5); | ||
|
|
||
| auto visualizer = make_operator<ops::HolovizOp>("visualizer", | ||
| from_config("holoviz"), | ||
| Arg("cuda_stream_pool", cuda_stream_pool), | ||
| Arg("allocator") = allocator); | ||
| add_flow(adv_net_media_rx, visualizer, {{"out_video_buffer", "receivers"}}); | ||
| } else if (from_config("media_player_config.write_to_file").as<bool>()) { | ||
| auto frames_writer = | ||
| make_operator<ops::FramesWriterOp>("frames_writer", from_config("frames_writer")); | ||
| add_flow(adv_net_media_rx, frames_writer); | ||
| } else { | ||
| HOLOSCAN_LOG_ERROR("At least one output type (write_to_file/visualize) must be defined"); | ||
| exit(1); | ||
| } |
There was a problem hiding this comment.
Replace exit(1) calls with exceptions for proper error handling.
The compose() method uses exit(1) on configuration errors (lines 274, 285, 310), which bypasses proper resource cleanup and destructor calls. The TX operator was updated to throw exceptions instead. Consider throwing std::runtime_error to allow the framework to handle cleanup gracefully.
🛠️ Proposed fix
auto adv_net_config = from_config("advanced_network").as<NetworkConfig>();
if (ano::adv_net_init(adv_net_config) != Status::SUCCESS) {
HOLOSCAN_LOG_ERROR("Failed to configure the Advanced Network manager");
- exit(1);
+ throw std::runtime_error("Failed to configure the Advanced Network manager");
}
HOLOSCAN_LOG_INFO("Configured the Advanced Network manager");
const auto [rx_en, tx_en] = ano::get_rx_tx_configs_enabled(config());
const auto mgr_type = ano::get_manager_type(config());
HOLOSCAN_LOG_INFO("Using Advanced Network manager {}",
ano::manager_type_to_string(mgr_type));
if (!rx_en) {
HOLOSCAN_LOG_ERROR("Rx is not enabled. Please enable Rx in the config file.");
- exit(1);
+ throw std::runtime_error("Rx is not enabled. Please enable Rx in the config file.");
}
// ... later ...
} else {
HOLOSCAN_LOG_ERROR("At least one output type (write_to_file/visualize) must be defined");
- exit(1);
+ throw std::runtime_error("At least one output type (write_to_file/visualize) must be defined");
}🤖 Prompt for AI Agents
In `@applications/adv_networking_media_player/cpp/adv_networking_media_player.cpp`
around lines 268 - 311, In compose(), replace the three exit(1) calls with
thrown std::runtime_error so destructors and framework cleanup can run: when
ano::adv_net_init(adv_net_config) fails (currently logs "Failed to configure the
Advanced Network manager"), throw std::runtime_error with that message and any
returned error details; when rx_en is false (currently logs "Rx is not
enabled..."), throw std::runtime_error with that message; and when neither
visualize nor write_to_file is enabled (currently logs "At least one output
type..."), throw std::runtime_error with that message—update compose(), and keep
the existing HOLOSCAN_LOG_* calls but remove exit(1) invocations so callers can
catch the exceptions.
| # Initial size below is set to 8 MB which is sufficient for | ||
| # a 1920 * 1080 RGBA image (uint8_t). | ||
| rmm_allocator: | ||
| device_memory_initial_size: "1024 MB" | ||
| device_memory_max_size: "1024 MB" | ||
| host_memory_initial_size: "1024 MB" | ||
| host_memory_max_size: "1024 MB" | ||
| dev_id: 0 |
There was a problem hiding this comment.
Misleading comment about memory allocation size.
The comment states "8 MB which is sufficient for a 1920 * 1080 RGBA image" but the actual allocation is 1024 MB. This comment appears to be a leftover from a different configuration and could confuse users.
📝 Proposed fix
-# Initial size below is set to 8 MB which is sufficient for
-# a 1920 * 1080 RGBA image (uint8_t).
+# Memory allocator configuration for video frame processing.
+# Sizes are set to 1024 MB to accommodate multiple frames in flight.
rmm_allocator:
device_memory_initial_size: "1024 MB"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Initial size below is set to 8 MB which is sufficient for | |
| # a 1920 * 1080 RGBA image (uint8_t). | |
| rmm_allocator: | |
| device_memory_initial_size: "1024 MB" | |
| device_memory_max_size: "1024 MB" | |
| host_memory_initial_size: "1024 MB" | |
| host_memory_max_size: "1024 MB" | |
| dev_id: 0 | |
| # Memory allocator configuration for video frame processing. | |
| # Sizes are set to 1024 MB to accommodate multiple frames in flight. | |
| rmm_allocator: | |
| device_memory_initial_size: "1024 MB" | |
| device_memory_max_size: "1024 MB" | |
| host_memory_initial_size: "1024 MB" | |
| host_memory_max_size: "1024 MB" | |
| dev_id: 0 |
🤖 Prompt for AI Agents
In `@applications/adv_networking_media_sender/adv_networking_media_sender.yaml`
around lines 33 - 40, The comment above the rmm_allocator block is misleading
(mentions "8 MB" and a 1920×1080 RGBA image) while the actual allocations
(device_memory_initial_size, device_memory_max_size, host_memory_initial_size,
host_memory_max_size) are set to "1024 MB"; update or remove that comment so it
accurately reflects the configured sizes—either change the comment to state
"Initial size set to 1024 MB" (or explain why 1024 MB is chosen) or delete the
outdated line to avoid confusion when inspecting rmm_allocator and its
device/host memory size fields.
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
| #include <assert.h> |
There was a problem hiding this comment.
Remove unused <assert.h> include.
This include is unused (no assert() calls in the file) and causes a static analysis error. Additionally, prefer <cassert> over <assert.h> in C++ code if assertions are needed in the future.
🔧 Proposed fix
-#include <assert.h>📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| #include <assert.h> |
🧰 Tools
🪛 Clang (14.0.6)
[error] 17-17: 'assert.h' file not found
(clang-diagnostic-error)
🤖 Prompt for AI Agents
In `@applications/adv_networking_media_sender/cpp/adv_networking_media_sender.cpp`
at line 17, Remove the unused C-style include by deleting the line that includes
<assert.h> from adv_networking_media_sender.cpp; if you later need assertions,
replace it with the C++ header <cassert> instead of <assert.h> so static
analysis no longer reports the unused include.
| AdvNetworkMediaRxOp::AdvNetworkMediaRxOp() : pimpl_(nullptr) {} | ||
|
|
||
| AdvNetworkMediaRxOp::~AdvNetworkMediaRxOp() { | ||
| if (pimpl_) { | ||
| pimpl_->cleanup(); // Clean up allocated resources | ||
| delete pimpl_; | ||
| pimpl_ = nullptr; | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat operators/advanced_network_media/advanced_network_media_rx/adv_network_media_rx.h | grep -A 5 -B 5 "pimpl_"Repository: nvidia-holoscan/holohub
Length of output: 318
🏁 Script executed:
rg -n "pimpl_" operators/advanced_network_media/advanced_network_media_rx/ -t hpp -t hRepository: nvidia-holoscan/holohub
Length of output: 94
🏁 Script executed:
sed -n '703,711p' operators/advanced_network_media/advanced_network_media_rx/adv_network_media_rx.cppRepository: nvidia-holoscan/holohub
Length of output: 296
Remove manual delete from destructor—pimpl_ is a std::unique_ptr and will auto-delete.
The header declares pimpl_ as std::unique_ptr<AdvNetworkMediaRxOpImpl>, but the destructor manually calls delete pimpl_, which causes a double-delete bug. The unique_ptr will automatically delete the managed object when it goes out of scope.
Either make the destructor empty to let unique_ptr handle cleanup, or call pimpl_.reset() after cleanup() if explicit cleanup ordering is required.
🤖 Prompt for AI Agents
In
`@operators/advanced_network_media/advanced_network_media_rx/adv_network_media_rx.cpp`
around lines 703 - 711, The destructor of AdvNetworkMediaRxOp manually deletes
pimpl_ which is a std::unique_ptr, causing a double-delete; locate
AdvNetworkMediaRxOp::~AdvNetworkMediaRxOp() and remove the explicit delete of
pimpl_, leaving either an empty destructor (so unique_ptr auto-deletes) or, if
you need to call pimpl_->cleanup() first, call pimpl_->cleanup() then
pimpl_.reset() (do not use delete pimpl_).
| PyAdvNetworkMediaRxOp(Fragment* fragment, const py::args& args, | ||
| const std::string& interface_name = "", | ||
| uint16_t queue_id = default_queue_id, uint32_t frame_width = 1920, | ||
| uint32_t frame_height = 1080, uint32_t bit_depth = 8, | ||
| const std::string& video_format = "RGB888", bool hds = true, | ||
| const std::string& output_format = "video_buffer", | ||
| const std::string& memory_location = "device", | ||
| const std::string& name = "advanced_network_media_rx") { | ||
| add_positional_condition_and_resource_args(this, args); | ||
| name_ = name; | ||
| fragment_ = fragment; | ||
| spec_ = std::make_shared<OperatorSpec>(fragment); | ||
| setup(*spec_.get()); | ||
|
|
||
| // Set parameters if provided | ||
| if (!interface_name.empty()) { this->add_arg(Arg("interface_name", interface_name)); } | ||
| this->add_arg(Arg("queue_id", queue_id)); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
fd -t f "adv_network_media_rx_pybind.cpp" --exec wc -l {} \;Repository: nvidia-holoscan/holohub
Length of output: 170
🏁 Script executed:
fd -t f "adv_network_media_rx_pybind.cpp" -x cat -n {}Repository: nvidia-holoscan/holohub
Length of output: 6509
🏁 Script executed:
fd -t f "*advanced_network_media*" -path "*/operators/*" | head -20Repository: nvidia-holoscan/holohub
Length of output: 235
🏁 Script executed:
fd "adv_network_media_rx.h" -x cat -n {}Repository: nvidia-holoscan/holohub
Length of output: 2965
🏁 Script executed:
fd "adv_network_media_rx.cpp" -x wc -l {}Repository: nvidia-holoscan/holohub
Length of output: 156
🏁 Script executed:
fd "adv_network_media_rx.cpp" -x cat -n {}Repository: nvidia-holoscan/holohub
Length of output: 35876
Make interface_name required or always set the parameter to avoid MetaParameter initialization error.
The parameter is declared without a default in setup() (line 717-720), making it required per SDK semantics. However, the Python binding defaults to "" and only adds the Arg conditionally (line 79), leaving the required parameter unset when using the default value, which triggers a MetaParameter error during operator initialization.
Align the binding with the required parameter semantics by either removing the default from the Python binding signature or always calling add_arg() regardless of emptiness.
🤖 Prompt for AI Agents
In
`@operators/advanced_network_media/advanced_network_media_rx/python/adv_network_media_rx_pybind.cpp`
around lines 64 - 80, The binding's constructor PyAdvNetworkMediaRxOp leaves the
required parameter declared in setup() ("interface_name") unset when defaulting
to "" and only conditionally calling add_arg; fix by either making
interface_name required in the Python signature (remove the "" default from
PyAdvNetworkMediaRxOp) or always add the argument to the spec regardless of
emptiness (call this->add_arg(Arg("interface_name", interface_name))
unconditionally) so the OperatorSpec created in setup() receives the required
MetaParameter; update references in PyAdvNetworkMediaRxOp, setup(), and the
conditional add_arg logic accordingly.
| AdvNetworkMediaTxOp::AdvNetworkMediaTxOp() : pimpl_(nullptr) { | ||
| } | ||
|
|
||
| AdvNetworkMediaTxOp::~AdvNetworkMediaTxOp() { | ||
| if (pimpl_) { | ||
| delete pimpl_; | ||
| pimpl_ = nullptr; | ||
| } | ||
| } |
There was a problem hiding this comment.
Double-free: manual delete on std::unique_ptr member.
The header declares pimpl_ as std::unique_ptr<AdvNetworkMediaTxOpImpl>, but the destructor manually calls delete pimpl_. This causes undefined behavior (double-free) because unique_ptr already deletes its managed object on destruction.
🐛 Proposed fix
AdvNetworkMediaTxOp::~AdvNetworkMediaTxOp() {
- if (pimpl_) {
- delete pimpl_;
- pimpl_ = nullptr;
- }
+ // unique_ptr automatically cleans up pimpl_
}Or simply use the default destructor:
-AdvNetworkMediaTxOp::~AdvNetworkMediaTxOp() {
- if (pimpl_) {
- delete pimpl_;
- pimpl_ = nullptr;
- }
-}
+AdvNetworkMediaTxOp::~AdvNetworkMediaTxOp() = default;🤖 Prompt for AI Agents
In
`@operators/advanced_network_media/advanced_network_media_tx/adv_network_media_tx.cpp`
around lines 276 - 284, The destructor AdvNetworkMediaTxOp::~AdvNetworkMediaTxOp
currently manually deletes pimpl_, but pimpl_ is a
std::unique_ptr<AdvNetworkMediaTxOpImpl>, causing a double-delete; remove the
manual delete and nullptr assignment and either default the destructor or leave
it empty so the unique_ptr cleans up automatically; update the
AdvNetworkMediaTxOp::~AdvNetworkMediaTxOp implementation (or use = default) and
ensure pimpl_ is only managed via the unique_ptr member.
| #define CUDA_TRY(stmt) \ | ||
| { \ | ||
| cudaError_t cuda_status = stmt; \ | ||
| if (cudaSuccess != cuda_status) { \ | ||
| ANM_LOG_ERROR("Runtime call {} in line {} of file {} failed with '{}' ({})", \ | ||
| #stmt, \ | ||
| __LINE__, \ | ||
| __FILE__, \ | ||
| cudaGetErrorString(cuda_status), \ | ||
| static_cast<int>(cuda_status)); \ | ||
| throw std::runtime_error("CUDA operation failed"); \ | ||
| } \ | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Search for CUDA_TRY usages in the codebase
rg "CUDA_TRY\s*\(" --context 3 -t cpp -t hppRepository: nvidia-holoscan/holohub
Length of output: 94
🏁 Script executed:
# Check if there's a definition or macro guard in the header file
cat -n operators/advanced_network_media/common/adv_network_media_common.hRepository: nvidia-holoscan/holohub
Length of output: 2508
🏁 Script executed:
# Search for any other CUDA error handling macros or patterns in the repo
rg "do\s*\{\s*while\s*\(\s*0\s*\)" -t cpp -t hpp --context 2 | head -50Repository: nvidia-holoscan/holohub
Length of output: 94
🏁 Script executed:
# Search for CUDA_TRY usages without file type restriction
rg "CUDA_TRY\s*\(" --context 4Repository: nvidia-holoscan/holohub
Length of output: 50379
🏁 Script executed:
# Look for common macro safety patterns in the codebase
rg "do\s*\{.*while\s*\(\s*0\s*\)" --multiline-dotall | head -30Repository: nvidia-holoscan/holohub
Length of output: 159
🏁 Script executed:
# Check if there are any if/else statements using CUDA_TRY that might be affected
rg "(if|else)\s*(.*CUDA_TRY|CUDA_TRY.*)" --context 3Repository: nvidia-holoscan/holohub
Length of output: 1990
🏁 Script executed:
# Check if there are any problematic if/else patterns with CUDA_TRY in adv_network_media files
find operators/advanced_network_media -name "*.cpp" -o -name "*.cu" | head -20Repository: nvidia-holoscan/holohub
Length of output: 895
🏁 Script executed:
# Look at actual CUDA_TRY usage in a file that uses it
rg "CUDA_TRY" operators/advanced_network_media -A 2 -B 2Repository: nvidia-holoscan/holohub
Length of output: 4299
Wrap CUDA_TRY in do { ... } while (0) for safe macro expansion.
The current bare-braces form can break if/else patterns when followed by a semicolon (which is natural when macros are treated as statements). The do/while(0) wrapper is the standard C/C++ idiom to prevent this dangling-else issue.
🔧 Suggested fix
-#define CUDA_TRY(stmt) \
- { \
- cudaError_t cuda_status = stmt; \
- if (cudaSuccess != cuda_status) { \
- ANM_LOG_ERROR("Runtime call {} in line {} of file {} failed with '{}' ({})", \
- `#stmt`, \
- __LINE__, \
- __FILE__, \
- cudaGetErrorString(cuda_status), \
- static_cast<int>(cuda_status)); \
- throw std::runtime_error("CUDA operation failed"); \
- } \
- }
+#define CUDA_TRY(stmt) \
+ do { \
+ cudaError_t cuda_status = stmt; \
+ if (cudaSuccess != cuda_status) { \
+ ANM_LOG_ERROR("Runtime call {} in line {} of file {} failed with '{}' ({})", \
+ `#stmt`, \
+ __LINE__, \
+ __FILE__, \
+ cudaGetErrorString(cuda_status), \
+ static_cast<int>(cuda_status)); \
+ throw std::runtime_error("CUDA operation failed"); \
+ } \
+ } while (0)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| #define CUDA_TRY(stmt) \ | |
| { \ | |
| cudaError_t cuda_status = stmt; \ | |
| if (cudaSuccess != cuda_status) { \ | |
| ANM_LOG_ERROR("Runtime call {} in line {} of file {} failed with '{}' ({})", \ | |
| #stmt, \ | |
| __LINE__, \ | |
| __FILE__, \ | |
| cudaGetErrorString(cuda_status), \ | |
| static_cast<int>(cuda_status)); \ | |
| throw std::runtime_error("CUDA operation failed"); \ | |
| } \ | |
| } | |
| `#define` CUDA_TRY(stmt) \ | |
| do { \ | |
| cudaError_t cuda_status = stmt; \ | |
| if (cudaSuccess != cuda_status) { \ | |
| ANM_LOG_ERROR("Runtime call {} in line {} of file {} failed with '{}' ({})", \ | |
| `#stmt`, \ | |
| __LINE__, \ | |
| __FILE__, \ | |
| cudaGetErrorString(cuda_status), \ | |
| static_cast<int>(cuda_status)); \ | |
| throw std::runtime_error("CUDA operation failed"); \ | |
| } \ | |
| } while (0) |
🤖 Prompt for AI Agents
In `@operators/advanced_network_media/common/adv_network_media_common.h` around
lines 27 - 39, The CUDA_TRY macro is defined with bare braces which can cause
dangling-else / statement grouping issues; wrap the macro body in the standard
do { ... } while (0) idiom so it behaves like a single statement in all
contexts. Modify the CUDA_TRY definition (keeping its internals: cudaError_t
cuda_status = stmt; the ANM_LOG_ERROR call with cudaGetErrorString(cuda_status)
and the throw std::runtime_error) to be enclosed in do { /* existing code */ }
while (0) and leave the macro name and arguments unchanged.
| Status VideoBufferFrameBuffer::validate_format_compliance( | ||
| nvidia::gxf::VideoFormat expected_format) const { | ||
| if (expected_format == nvidia::gxf::VideoFormat::GXF_VIDEO_FORMAT_NV12_709) { | ||
| if (format_ != nvidia::gxf::VideoFormat::GXF_VIDEO_FORMAT_NV12_709 && | ||
| format_ != nvidia::gxf::VideoFormat::GXF_VIDEO_FORMAT_YUV420_709) { | ||
| ANM_LOG_ERROR("Invalid NV12_709 format"); | ||
| return Status::INVALID_PARAMETER; | ||
| } | ||
| } else if (expected_format == nvidia::gxf::VideoFormat::GXF_VIDEO_FORMAT_RGB) { | ||
| if (format_ != nvidia::gxf::VideoFormat::GXF_VIDEO_FORMAT_RGB) { | ||
| ANM_LOG_ERROR("Invalid RGB format"); | ||
| return Status::INVALID_PARAMETER; | ||
| } | ||
| } else if (format_ != nvidia::gxf::VideoFormat::GXF_VIDEO_FORMAT_CUSTOM) { | ||
| ANM_LOG_ERROR( | ||
| "Format mismatch: {} vs {}", static_cast<int>(format_), static_cast<int>(expected_format)); | ||
| return Status::INVALID_PARAMETER; | ||
| } | ||
|
|
||
| if (expected_format == nvidia::gxf::VideoFormat::GXF_VIDEO_FORMAT_NV12_709) { | ||
| if (width_ % SMPTE_420_ALIGNMENT != 0 || height_ % SMPTE_420_ALIGNMENT != 0) { | ||
| ANM_LOG_ERROR("Resolution not 4:2:0 aligned"); | ||
| return Status::INVALID_PARAMETER; | ||
| } | ||
| } | ||
|
|
||
| for (const auto& plane : planes_) { | ||
| if (plane.stride % SMPTE_STRIDE_ALIGNMENT != 0) { | ||
| ANM_LOG_ERROR("Stride {} not {}-byte aligned", plane.stride, SMPTE_STRIDE_ALIGNMENT); | ||
| return Status::INVALID_PARAMETER; | ||
| } | ||
| } | ||
|
|
||
| return Status::SUCCESS; | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Find all video format usage in the codebase
rg -n 'GXF_VIDEO_FORMAT_' operators/advanced_network_media/ --type cpp --type hpp | head -50Repository: nvidia-holoscan/holohub
Length of output: 94
🏁 Script executed:
# Check the full frame_buffer.cpp file to see the complete validation logic
cd operators/advanced_network_media/common && wc -l frame_buffer.cppRepository: nvidia-holoscan/holohub
Length of output: 87
🏁 Script executed:
# Read the VideoBufferFrameBuffer and TensorFrameBuffer validation methods
cd operators/advanced_network_media/common && sed -n '57,140p' frame_buffer.cppRepository: nvidia-holoscan/holohub
Length of output: 3229
🏁 Script executed:
# Search for all expected_format values being passed to validate_format_compliance
rg -n 'validate_format_compliance' operators/advanced_network_media/ --type cpp --type hpp -A 3Repository: nvidia-holoscan/holohub
Length of output: 94
🏁 Script executed:
# Search for calls to validate_format_compliance with different approaches
rg -n 'validate_format_compliance' operators/advanced_network_media/ --type cpp -B 2 -A 3Repository: nvidia-holoscan/holohub
Length of output: 5392
🏁 Script executed:
# Look for where expected_format is determined (callers of validate_format_compliance)
rg -n 'VideoBufferFrameBuffer\|TensorFrameBuffer' operators/advanced_network_media/ --type cpp | grep -E '(new|make_unique|instantiate)' | head -20Repository: nvidia-holoscan/holohub
Length of output: 49
🏁 Script executed:
# Check what formats are passed to the operators
cd operators/advanced_network_media && find . -name "*.yaml" -o -name "*.cpp" | xargs grep -l "expected_format\|validate_format" | head -10Repository: nvidia-holoscan/holohub
Length of output: 92
🏁 Script executed:
# Look at the actual implementation of methods that call validate_format_compliance
cd operators/advanced_network_media/common && sed -n '1,56p' frame_buffer.cppRepository: nvidia-holoscan/holohub
Length of output: 2131
🏁 Script executed:
# Check the AllocatedVideoBufferFrameBuffer and AllocatedTensorFrameBuffer validate methods for comparison
cd operators/advanced_network_media/common && sed -n '147,155p' frame_buffer.cpp && echo "---" && sed -n '214,222p' frame_buffer.cppRepository: nvidia-holoscan/holohub
Length of output: 776
🏁 Script executed:
# Look for where these validate methods are called to understand the usage pattern
rg -n 'validate_frame_parameters\|validate_format_compliance' operators/advanced_network_media/ --type cpp -B 1 -A 2 | grep -A 3 'validate_'Repository: nvidia-holoscan/holohub
Length of output: 49
🏁 Script executed:
# Check if there are any tests or examples showing expected behavior
fd -e cpp -e h . operators/advanced_network_media | xargs grep -l 'validate_format_compliance' | head -5Repository: nvidia-holoscan/holohub
Length of output: 178
🏁 Script executed:
# Let me see the full TensorFrameBuffer::validate_format_compliance to understand if it uses expected_format
cd operators/advanced_network_media/common && sed -n '109,135p' frame_buffer.cppRepository: nvidia-holoscan/holohub
Length of output: 1005
🏁 Script executed:
# Check if there's any documentation or comments explaining the CUSTOM format special handling
cd operators/advanced_network_media/common && grep -n "CUSTOM\|GXF_VIDEO_FORMAT" frame_buffer.h | head -20Repository: nvidia-holoscan/holohub
Length of output: 49
🏁 Script executed:
# Check what the complete AllocatedTensorFrameBuffer validation looks like
cd operators/advanced_network_media/common && sed -n '214,235p' frame_buffer.cppRepository: nvidia-holoscan/holohub
Length of output: 963
🏁 Script executed:
# Search for how these validate methods are being called in the operators
fd -e cpp . operators/advanced_network_media | xargs grep -n 'validate_frame_parameters\|validate_format_compliance' | grep -v frame_bufferRepository: nvidia-holoscan/holohub
Length of output: 403
🏁 Script executed:
# Look at how validate_frame_parameters is being called to understand what expected_format values are passed
cd operators/advanced_network_media/advanced_network_media_tx && sed -n '90,130p' adv_network_media_tx.cppRepository: nvidia-holoscan/holohub
Length of output: 1709
🏁 Script executed:
# Get the full picture of VideoBufferFrameBuffer::validate_format_compliance including context
cd operators/advanced_network_media/common && sed -n '57,95p' frame_buffer.cppRepository: nvidia-holoscan/holohub
Length of output: 1645
🏁 Script executed:
# Check what expected_video_format_ values are set in the operator
cd operators/advanced_network_media && rg -n 'expected_video_format_' --type cpp -B 2 -A 2 | head -40Repository: nvidia-holoscan/holohub
Length of output: 2368
🏁 Script executed:
# Look at what video formats are actually used/expected in config files
fd -type f \( -name "*.yaml" -o -name "*.json" \) . operators/advanced_network_media | xargs grep -l 'format\|Format' | head -5Repository: nvidia-holoscan/holohub
Length of output: 237
🏁 Script executed:
# Check the get_expected_gxf_video_format function to see what formats are returned
cd operators/advanced_network_media && rg -n 'get_expected_gxf_video_format' --type cpp -B 2 -A 10 | head -60Repository: nvidia-holoscan/holohub
Length of output: 4588
🏁 Script executed:
# Search for yaml files
find operators/advanced_network_media -name "*.yaml" | xargs grep -l 'format\|Format' 2>/dev/null | head -5Repository: nvidia-holoscan/holohub
Length of output: 49
🏁 Script executed:
# Check what VideoFormatSampling values can produce
cd operators/advanced_network_media && rg -n 'VideoFormatSampling\|get_expected_gxf_video_format' --type cpp | head -20Repository: nvidia-holoscan/holohub
Length of output: 49
🏁 Script executed:
# Let me trace through the logic one more time - check what get_expected_gxf_video_format actually returns
cd operators/advanced_network_media/common && sed -n '67,78p' video_parameters.cppRepository: nvidia-holoscan/holohub
Length of output: 777
🏁 Script executed:
# Check if there are any other places where VideoBufferFrameBuffer is used with different expected_format values
rg -n 'validate_format_compliance\|validate_frame_parameters' operators/advanced_network_media --type cpp -B 3 | grep -E '(validate|expected_video_format_|expected_format)'Repository: nvidia-holoscan/holohub
Length of output: 49
🏁 Script executed:
# Let me verify if TensorFrameBuffer is intentionally not checking expected_format
cd operators/advanced_network_media/common && sed -n '104,145p' frame_buffer.cppRepository: nvidia-holoscan/holohub
Length of output: 1460
🏁 Script executed:
# Check if there's a reason why TensorFrameBuffer doesn't validate against expected_format
rg -n 'TensorFrameBuffer' operators/advanced_network_media --type cpp -B 2 -A 5 | head -40Repository: nvidia-holoscan/holohub
Length of output: 3930
🏁 Script executed:
# Check if there's a separate validation elsewhere for TensorFrameBuffer format matching
rg -n 'expected_video_format_' operators/advanced_network_media/advanced_network_media_tx --type cpp -B 1 -A 3 | grep -A 5 'TensorFrameBuffer\|create_media_frame_from_tensor'Repository: nvidia-holoscan/holohub
Length of output: 940
🏁 Script executed:
# Verify the exact line numbers and logic once more
cd operators/advanced_network_media/common && sed -n '57,75p' frame_buffer.cpp && echo "---TENSOR---" && sed -n '109,120p' frame_buffer.cppRepository: nvidia-holoscan/holohub
Length of output: 1477
Clarify and standardize format validation logic across FrameBuffer implementations.
The fallback condition at lines 70-74 should explicitly check format_ != expected_format instead of relying on an implicit assumption about CUSTOM format. This makes the logic consistent with AllocatedVideoBufferFrameBuffer and AllocatedTensorFrameBuffer, both of which use direct equality checks.
Additionally, TensorFrameBuffer::validate_format_compliance() accepts an expected_format parameter but never uses it. It should validate that format_ == expected_format before checking tensor shape, matching the pattern used in the Allocated* implementations.
🤖 Prompt for AI Agents
In `@operators/advanced_network_media/common/frame_buffer.cpp` around lines 57 -
91, In VideoBufferFrameBuffer::validate_format_compliance replace the fallback
branch that currently checks format_ != GXF_VIDEO_FORMAT_CUSTOM with an explicit
check that format_ != expected_format (so the code fails when the actual format
doesn't equal expected_format), keeping the NV12_709 and RGB special cases
intact; likewise update TensorFrameBuffer::validate_format_compliance to
actually use the expected_format parameter by validating format_ ==
expected_format before performing tensor-shape checks, matching the
equality-based pattern used in AllocatedVideoBufferFrameBuffer and
AllocatedTensorFrameBuffer.
| bool parse(const uint8_t* rtp_hdr) { | ||
| // Validate RTP version (must be version 2) | ||
| if ((rtp_hdr[0] & 0xC0) != 0x80) { | ||
| return false; | ||
| } | ||
|
|
||
| // Extract CSRC count and calculate offset | ||
| uint8_t cc = 0x0F & rtp_hdr[0]; | ||
| uint8_t offset = cc * RTP_HEADER_CSRC_GRANULARITY_BYTES; | ||
|
|
||
| // Extract sequence number (16-bit + 16-bit extended) | ||
| // Cast to uint32_t before shifting to avoid undefined behavior from integer promotion | ||
| sequence_number = static_cast<uint32_t>(rtp_hdr[3]) | (static_cast<uint32_t>(rtp_hdr[2]) << 8); | ||
| sequence_number |= (static_cast<uint32_t>(rtp_hdr[offset + 12]) << 24) | | ||
| (static_cast<uint32_t>(rtp_hdr[offset + 13]) << 16); | ||
|
|
||
| // Extract field bit | ||
| f_bit = !!(rtp_hdr[offset + 16] & 0x80); | ||
|
|
||
| // Extract timestamp (32-bit, network byte order = big-endian) | ||
| timestamp = (static_cast<uint32_t>(rtp_hdr[4]) << 24) | | ||
| (static_cast<uint32_t>(rtp_hdr[5]) << 16) | | ||
| (static_cast<uint32_t>(rtp_hdr[6]) << 8) | | ||
| static_cast<uint32_t>(rtp_hdr[7]); | ||
|
|
||
| // Extract marker bit | ||
| m_bit = !!(rtp_hdr[1] & 0x80); | ||
|
|
||
| // Extract payload size from SRD Length field (2 bytes, network byte order) | ||
| // Cast to uint16_t before shifting to avoid undefined behavior from integer promotion | ||
| payload_size = (static_cast<uint16_t>(rtp_hdr[offset + 14]) << 8) | | ||
| static_cast<uint16_t>(rtp_hdr[offset + 15]); |
There was a problem hiding this comment.
Add basic input validation for rtp_hdr (and length if available).
parse() assumes a non-null, sufficiently long buffer and reads offsets based on cc. A malformed packet can cause OOB reads. Consider adding a nullptr guard and a length parameter so you can validate rtp_len >= offset + 16 before accessing.
🛡️ Minimal guard (plus length parameter if you can)
-bool parse(const uint8_t* rtp_hdr) {
+bool parse(const uint8_t* rtp_hdr, size_t rtp_len) {
+ if (!rtp_hdr) { return false; }
// Validate RTP version (must be version 2)
if ((rtp_hdr[0] & 0xC0) != 0x80) {
return false;
}
// Extract CSRC count and calculate offset
uint8_t cc = 0x0F & rtp_hdr[0];
uint8_t offset = cc * RTP_HEADER_CSRC_GRANULARITY_BYTES;
+ if (rtp_len < static_cast<size_t>(offset) + 16) { return false; }🤖 Prompt for AI Agents
In `@operators/advanced_network_media/common/rtp_params.h` around lines 39 - 70,
The parse() function currently dereferences rtp_hdr and indexes using cc/offset
without bounds checks; add a nullptr guard for rtp_hdr and change the signature
to accept a length (e.g., size_t rtp_len), validate the base RTP header (minimum
12 bytes), compute cc and offset using RTP_HEADER_CSRC_GRANULARITY_BYTES, then
ensure rtp_len >= offset + 17 (to cover reads up to rtp_hdr[offset + 16]) before
accessing any bytes; if any check fails, return false — update uses of
sequence_number, f_bit, timestamp, m_bit, and payload_size accordingly.
0da0138 to
3c454dc
Compare
| size_t pointer_diff = curr_ptr - prev_ptr; | ||
|
|
There was a problem hiding this comment.
Unsigned pointer subtraction wraps on ring-buffer backward-pointer case
size_t pointer_diff = curr_ptr - prev_ptr stores the result of a ptrdiff_t (signed) subtraction into an unsigned size_t. If curr_ptr < prev_ptr (ring-buffer wraparound with a backward pointer diff ≤ 1 MB, which slips past detect_buffer_wraparound()), the value wraps to a value near SIZE_MAX. This would then become actual_stride, and a later StridedMemoryCopyStrategy could be constructed with a nonsensical stride that leads to out-of-bounds cudaMemcpy2D reads.
The same unsigned-wrap issue appears in StridedMemoryCopyStrategy::is_stride_maintained() at line ~509 (size_t actual_diff = payload - last_packet_ptr_).
Prefer ptrdiff_t for pointer differences and guard against negative values explicitly:
// analyze_pattern()
ptrdiff_t signed_diff = curr_ptr - prev_ptr;
if (signed_diff <= 0) {
// backward or zero pointer diff - treat as non-contiguous
is_exactly_contiguous = false;
is_stride_consistent = false;
break;
}
size_t pointer_diff = static_cast<size_t>(signed_diff);| return true; | ||
| } | ||
|
|
||
| size_t actual_diff = payload - last_packet_ptr_; |
There was a problem hiding this comment.
Unsigned pointer subtraction in stride validation
Same sign-wrapping issue as in analyze_pattern(): if payload < last_packet_ptr_ due to a ring-buffer wraparound with a backward diff ≤ 1 MB, actual_diff wraps to near SIZE_MAX. In the first-stride check path this is then stored as actual_stride_, which would later be passed as src_pitch to cudaMemcpy2D — a CUDA API call that would fail or access out-of-bounds GPU memory with a pitch that large.
// Use signed arithmetic to safely detect backward/wrapped pointers
ptrdiff_t signed_diff = payload - last_packet_ptr_;
if (signed_diff <= 0) {
return false; // Treat backward pointers as stride break
}
size_t actual_diff = static_cast<size_t>(signed_diff);There was a problem hiding this comment.
♻️ Duplicate comments (3)
applications/adv_networking_media_player/cpp/adv_networking_media_player.cpp (2)
33-45:⚠️ Potential issue | 🟡 MinorWrap CUDA_TRY macro in
do { ... } while (0).The macro body uses bare braces which can cause unexpected behavior when used after
ifwithout braces. This is the standard C/C++ idiom to make macros behave like single statements.Proposed fix
`#define` CUDA_TRY(stmt) \ - { \ + do { \ cudaError_t cuda_status = stmt; \ if (cudaSuccess != cuda_status) { \ HOLOSCAN_LOG_ERROR("Runtime call {} in line {} of file {} failed with '{}' ({})", \ `#stmt`, \ __LINE__, \ __FILE__, \ cudaGetErrorString(cuda_status), \ static_cast<int>(cuda_status)); \ throw std::runtime_error("CUDA operation failed"); \ } \ - } + } while (0)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@applications/adv_networking_media_player/cpp/adv_networking_media_player.cpp` around lines 33 - 45, The CUDA_TRY macro currently uses bare braces which can break when used in control statements; change its definition so the entire body is wrapped in the do { ... } while (0) idiom to make CUDA_TRY behave like a single statement. Update the macro named CUDA_TRY (the whole block starting with `#define` CUDA_TRY(stmt)) so that the existing cudaError_t cuda_status assignment, the if check, HOLOSCAN_LOG_ERROR call, and the throw are enclosed inside a do { ... } while (0) wrapper.
268-311:⚠️ Potential issue | 🟠 MajorReplace
exit(1)calls with exceptions for proper resource cleanup.The
compose()method usesexit(1)on configuration errors (lines 274, 285, 310), which bypasses destructor calls and proper resource cleanup. Throwingstd::runtime_errorallows the framework to handle cleanup gracefully.Proposed fix
auto adv_net_config = from_config("advanced_network").as<NetworkConfig>(); if (ano::adv_net_init(adv_net_config) != Status::SUCCESS) { HOLOSCAN_LOG_ERROR("Failed to configure the Advanced Network manager"); - exit(1); + throw std::runtime_error("Failed to configure the Advanced Network manager"); } HOLOSCAN_LOG_INFO("Configured the Advanced Network manager"); const auto [rx_en, tx_en] = ano::get_rx_tx_configs_enabled(config()); const auto mgr_type = ano::get_manager_type(config()); HOLOSCAN_LOG_INFO("Using Advanced Network manager {}", ano::manager_type_to_string(mgr_type)); if (!rx_en) { HOLOSCAN_LOG_ERROR("Rx is not enabled. Please enable Rx in the config file."); - exit(1); + throw std::runtime_error("Rx is not enabled. Please enable Rx in the config file."); }} else { HOLOSCAN_LOG_ERROR("At least one output type (write_to_file/visualize) must be defined"); - exit(1); + throw std::runtime_error("At least one output type (write_to_file/visualize) must be defined"); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@applications/adv_networking_media_player/cpp/adv_networking_media_player.cpp` around lines 268 - 311, In compose(), replace the direct calls to exit(1) with throwing std::runtime_error carrying a descriptive message so RAII and destructors can run; specifically, where ano::adv_net_init(...) currently triggers exit(1), throw std::runtime_error("Failed to configure the Advanced Network manager"); where the rx_en check calls exit(1), throw std::runtime_error("Rx is not enabled. Please enable Rx in the config file."); and where the final output-type check calls exit(1), throw std::runtime_error("At least one output type (write_to_file/visualize) must be defined"); ensure <stdexcept> is included if not already.applications/adv_networking_media_sender/cpp/adv_networking_media_sender.cpp (1)
17-17:⚠️ Potential issue | 🟡 MinorRemove unused
<assert.h>include.This header is not used anywhere in the file (no
assert()calls) and causes a static analysis error. If assertions are needed in the future, prefer<cassert>over<assert.h>in C++ code.🔧 Proposed fix
-#include <assert.h>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@applications/adv_networking_media_sender/cpp/adv_networking_media_sender.cpp` at line 17, Remove the unused C header include by deleting the line that includes <assert.h> from adv_networking_media_sender.cpp; if you later need assertions in this C++ source prefer adding <cassert> and use assert() from the std namespace as needed.
🧹 Nitpick comments (2)
applications/adv_networking_media_player/cpp/adv_networking_media_player.cpp (1)
304-307: Consider explicit port mapping for consistency.The
write_to_filebranch uses implicit port mapping while thevisualizebranch explicitly maps{"out_video_buffer", "receivers"}. For clarity and maintainability, consider making the port mapping explicit here as well.Suggested change
} else if (from_config("media_player_config.write_to_file").as<bool>()) { auto frames_writer = make_operator<ops::FramesWriterOp>("frames_writer", from_config("frames_writer")); - add_flow(adv_net_media_rx, frames_writer); + add_flow(adv_net_media_rx, frames_writer, {{"out_video_buffer", "input"}}); } else {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@applications/adv_networking_media_player/cpp/adv_networking_media_player.cpp` around lines 304 - 307, The write_to_file branch currently adds frames_writer with implicit port mapping; make it explicit to match the visualize branch by passing the same port mapping when wiring adv_net_media_rx to the frames_writer operator (reference adv_net_media_rx, frames_writer, make_operator<ops::FramesWriterOp>, add_flow and from_config("media_player_config.write_to_file").as<bool()); update the add_flow call to include the mapping for "out_video_buffer" -> "receivers" so connections are explicit and consistent.applications/adv_networking_media_sender/python/adv_networking_media_sender.py (1)
33-96: Consider extracting sharedcheck_rx_tx_enabled()to a common module.This function is duplicated verbatim from
adv_networking_media_player.py. Consider extracting it to a shared utility module under theholohub.advanced_network_commonpackage to avoid code duplication across applications.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@applications/adv_networking_media_sender/python/adv_networking_media_sender.py` around lines 33 - 96, The check_rx_tx_enabled function is duplicated (in adv_networking_media_sender.py and adv_networking_media_player.py); extract it into a shared utility module (e.g., create holohub.advanced_network_common with a function check_rx_tx_enabled) and replace the local definitions in both files with an import from that module; ensure the new shared function preserves the same signature and behavior (app, require_rx=True, require_tx=False), update imports in both modules to import check_rx_tx_enabled, and remove the duplicated function bodies so both callers reference the single shared implementation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@applications/adv_networking_media_player/cpp/adv_networking_media_player.cpp`:
- Around line 33-45: The CUDA_TRY macro currently uses bare braces which can
break when used in control statements; change its definition so the entire body
is wrapped in the do { ... } while (0) idiom to make CUDA_TRY behave like a
single statement. Update the macro named CUDA_TRY (the whole block starting with
`#define` CUDA_TRY(stmt)) so that the existing cudaError_t cuda_status assignment,
the if check, HOLOSCAN_LOG_ERROR call, and the throw are enclosed inside a do {
... } while (0) wrapper.
- Around line 268-311: In compose(), replace the direct calls to exit(1) with
throwing std::runtime_error carrying a descriptive message so RAII and
destructors can run; specifically, where ano::adv_net_init(...) currently
triggers exit(1), throw std::runtime_error("Failed to configure the Advanced
Network manager"); where the rx_en check calls exit(1), throw
std::runtime_error("Rx is not enabled. Please enable Rx in the config file.");
and where the final output-type check calls exit(1), throw
std::runtime_error("At least one output type (write_to_file/visualize) must be
defined"); ensure <stdexcept> is included if not already.
In
`@applications/adv_networking_media_sender/cpp/adv_networking_media_sender.cpp`:
- Line 17: Remove the unused C header include by deleting the line that includes
<assert.h> from adv_networking_media_sender.cpp; if you later need assertions in
this C++ source prefer adding <cassert> and use assert() from the std namespace
as needed.
---
Nitpick comments:
In
`@applications/adv_networking_media_player/cpp/adv_networking_media_player.cpp`:
- Around line 304-307: The write_to_file branch currently adds frames_writer
with implicit port mapping; make it explicit to match the visualize branch by
passing the same port mapping when wiring adv_net_media_rx to the frames_writer
operator (reference adv_net_media_rx, frames_writer,
make_operator<ops::FramesWriterOp>, add_flow and
from_config("media_player_config.write_to_file").as<bool()); update the add_flow
call to include the mapping for "out_video_buffer" -> "receivers" so connections
are explicit and consistent.
In
`@applications/adv_networking_media_sender/python/adv_networking_media_sender.py`:
- Around line 33-96: The check_rx_tx_enabled function is duplicated (in
adv_networking_media_sender.py and adv_networking_media_player.py); extract it
into a shared utility module (e.g., create holohub.advanced_network_common with
a function check_rx_tx_enabled) and replace the local definitions in both files
with an import from that module; ensure the new shared function preserves the
same signature and behavior (app, require_rx=True, require_tx=False), update
imports in both modules to import check_rx_tx_enabled, and remove the duplicated
function bodies so both callers reference the single shared implementation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1f044f6f-da3c-46d7-97f7-7ad6e7a24c91
📒 Files selected for processing (19)
applications/CMakeLists.txtapplications/adv_networking_media_player/CMakeLists.txtapplications/adv_networking_media_player/README.mdapplications/adv_networking_media_player/adv_networking_media_player.yamlapplications/adv_networking_media_player/cpp/CMakeLists.txtapplications/adv_networking_media_player/cpp/adv_networking_media_player.cppapplications/adv_networking_media_player/cpp/metadata.jsonapplications/adv_networking_media_player/python/CMakeLists.txtapplications/adv_networking_media_player/python/adv_networking_media_player.pyapplications/adv_networking_media_player/python/metadata.jsonapplications/adv_networking_media_sender/CMakeLists.txtapplications/adv_networking_media_sender/README.mdapplications/adv_networking_media_sender/adv_networking_media_sender.yamlapplications/adv_networking_media_sender/cpp/CMakeLists.txtapplications/adv_networking_media_sender/cpp/adv_networking_media_sender.cppapplications/adv_networking_media_sender/cpp/metadata.jsonapplications/adv_networking_media_sender/python/CMakeLists.txtapplications/adv_networking_media_sender/python/adv_networking_media_sender.pyapplications/adv_networking_media_sender/python/metadata.json
✅ Files skipped from review due to trivial changes (12)
- applications/adv_networking_media_player/CMakeLists.txt
- applications/CMakeLists.txt
- applications/adv_networking_media_sender/python/metadata.json
- applications/adv_networking_media_player/cpp/metadata.json
- applications/adv_networking_media_sender/cpp/metadata.json
- applications/adv_networking_media_sender/python/CMakeLists.txt
- applications/adv_networking_media_player/adv_networking_media_player.yaml
- applications/adv_networking_media_sender/adv_networking_media_sender.yaml
- applications/adv_networking_media_player/cpp/CMakeLists.txt
- applications/adv_networking_media_sender/cpp/CMakeLists.txt
- applications/adv_networking_media_player/python/metadata.json
- applications/adv_networking_media_player/python/CMakeLists.txt
🚧 Files skipped from review as they are similar to previous changes (1)
- applications/adv_networking_media_sender/CMakeLists.txt
- Add AdvNetworkMediaRxOp for receiving media streams over Rivermax - Supports packet-to-frame conversion with ANO Burst processing - Configurable video formats, bit depths, and frame dimensions - Output as VideoBuffer or Tensor entities - Add AdvNetworkMediaTxOp for transmitting media streams over Rivermax - Processes VideoBuffer/Tensor inputs for network transmission - Configurable network interface and queue management - Support for various video formats Features: - SMPTE 2110 compliance for professional media over IP - GPU acceleration and GPUDirect support - Low latency optimizations - Python bindings for both operators These operators build upon the Advanced Network library to provide specialized functionality for broadcast and media streaming use cases requiring strict timing and high throughput performance. Signed-off-by: Rony Rado <rrado@nvidia.com>
- Add high-performance media streaming application built on Advanced Network Media Tx operator - Utilizes AdvNetworkMediaTxOp for media transmission over Rivermax - Integrates with Advanced Network Manager for optimized network resource management - SMPTE 2110 compliant for professional broadcast applications - Real-time transmission of media files with precise timing control - Support multiple video formats and resolutions - RGB888, YUV420, NV12 and other common video formats - Configurable bit depths (8, 10, 12, 16-bit) - Multiple resolution support up to 4K and beyond - GPU acceleration with GPUDirect for zero-copy operations * Implement dual language support - C++ implementation for maximum performance - Python implementation for ease of use and development * Support professional broadcast features - Frame-accurate timing for live streaming applications - Looping playback for continuous streaming - VideoStreamReplayer integration for file-based sources Pipeline: VideoStreamReplayer → AdvNetworkMediaTxOp → Advanced Network Manager → Network Interface This application demonstrates how to use the Advanced Network Media operators for professional-grade media transmission in broadcast and media production environments requiring ultra-low latency and high throughput performance. Signed-off-by: Rony Rado <rrado@nvidia.com>
- Add high-performance media receiving application built on Advanced Network Media Rx operator - Utilizes AdvNetworkMediaRxOp for professional media reception over Rivermax - Integrates with Advanced Network Manager for optimized network resource management - SMPTE 2110 compliant for professional broadcast applications - Real-time reception and processing of media streams with ultra-low latency - Support flexible output modes and format handling - Real-time visualization using HolovizOp for live monitoring - File output capability for recording and analysis - Format conversion support for optimal display - Support multiple video formats and resolutions - RGB888, YUV420, NV12, RGBA and other common video formats - Configurable bit depths (8, 10, 12, 16-bit) - Multiple resolution support up to 4K and beyond - GPU acceleration with GPUDirect for zero-copy operations - Implement dual language support - C++ implementation for maximum performance - Python implementation for ease of use and development Pipeline: Network Interface → Advanced Network Manager → AdvNetworkMediaRxOp → [FormatConverter] → HolovizOp/FramesWriter This application demonstrates how to use the Advanced Network Media operators for professional-grade media reception in broadcast and media production environments requiring ultra-low latency and high throughput performance. Signed-off-by: Rony Rado <rrado@nvidia.com>
…configurable thresholds
- Add adaptive burst dropping system for Rivermax burst pool management
- Implements capacity-aware burst dropping to prevent memory pool exhaustion
- Configurable thresholds for warning (25%), critical (10%), and recovery (50%) levels
- Critical threshold mode drops new bursts when pool capacity falls below threshold
- Automatic recovery when pool capacity returns to healthy levels
- Add comprehensive pool capacity monitoring and statistics
- Extend YAML configuration support for burst pool adaptive dropping
- New "burst_pool_adaptive_dropping" configuration section
- Configurable enabled/disabled flag and threshold percentages
This enhancement prevents memory pool exhaustion in high-throughput ANO Rivermax
operations by intelligently dropping incoming bursts when system capacity
is critically low, ensuring system stability while maintaining optimal
performance under normal operating conditions.
Signed-off-by: Rony Rado <rrado@nvidia.com>
Lint fixes Signed-off-by: Rony Rado <rrado@nvidia.com>
- Install holoscan python SDK - Install libvulkan for holoviz operator - Install xvfb and x11vnc for headless servers Signed-off-by: Rony Rado <rrado@nvidia.com>
…kMediaTxOpImpl Signed-off-by: Dana Lessner <dlessner@nvidia.com>
Signed-off-by: Dana Lessner <dlessner@nvidia.com>
Signed-off-by: Dana Lessner <dlessner@nvidia.com>
The shared manager.cpp and manager.h depend on DPDK headers so the Rivermax build target requires DPDK to be present. - Dockerfile - change the rivermax stage to extend from dpdk instead of base-deps. - manager.cpp DPDK include guard only covered DPDK/GPUNetIO/RDMA builds. Extend it to cover ANO_MGR_RIVERMAX as well. - RivermaxMgr::get_segment_packet_length() and get_packet_length() had uint16_t return types, mismatching the uint32_t declared in the Manager base class. Signed-off-by: Dana Lessner <dlessner@nvidia.com>
def1849 to
db2af4d
Compare
| std::shared_ptr<FrameBufferBase> get_allocated_frame() { | ||
| if (frames_pool_.empty()) { | ||
| throw std::runtime_error("Running out of resources, frames pool is empty"); | ||
| } | ||
| auto frame = frames_pool_.front(); | ||
| frames_pool_.pop_front(); | ||
| return frame; | ||
| } |
There was a problem hiding this comment.
Frame pool exhaustion crashes the pipeline
When frames_pool_ is empty, get_allocated_frame() throws std::runtime_error. This exception propagates uncaught through get_new_frame() → FrameAssemblyController::allocate_new_frame() → packet processing → append_to_frame() → compute(). AdvNetworkMediaRxOp::compute() catches and re-throws it, causing the Holoscan framework to mark the operator as failed and terminate the pipeline entirely. Under backpressure (e.g., a slow downstream HolovizOp holding frames in a buffer), all 250 pool slots can be exhausted, making a crash a realistic production scenario rather than a theoretical one.
The has_available_frames() predicate exists on IFrameProvider but is never consulted before calling get_new_frame() in the assembler path. A graceful fix is to check availability and skip the burst instead of throwing:
std::shared_ptr<FrameBufferBase> get_allocated_frame() {
if (frames_pool_.empty()) {
ANM_LOG_WARN("Frame pool empty, dropping burst");
return nullptr; // caller must handle nullptr gracefully
}
auto frame = frames_pool_.front();
frames_pool_.pop_front();
return frame;
}FrameAssemblyController::allocate_new_frame() already checks for a nullptr return (line 161) and returns false, so the assembler path already has a hook for this — it simply isn't reached because the throw fires first.
There was a problem hiding this comment.
@ronyrad I'd appreciate it if you could take a look at these comments? Thanks
| if (result.new_frame_state == FrameState::RECEIVING_PACKETS && | ||
| !result.should_skip_memory_copy_processing && current_copy_strategy_ && payload) { | ||
| StateEvent copy_strategy_result = current_copy_strategy_->process_packet( | ||
| *assembly_controller_, payload, rtp_params.payload_size); |
There was a problem hiding this comment.
M-bit (end-of-frame) packet payload is never copied into the frame. The memory copy is gated on result.new_frame_state == FrameState::RECEIVING_PACKETS. When the M-bit packet arrives, the state machine transitions to IDLE, so this condition is false and process_packet() is never called for the last packet. Every assembled frame is systematically missing its final RTP packet's payload (~1350 bytes), causing trailing-edge corruption in every output frame.
| CUDA_TRY(cudaFree(frame->get())); | ||
| } | ||
| } | ||
| frames_pool_.clear(); |
There was a problem hiding this comment.
Use-after-free on shutdown when frames are still in-flight downstream. cleanup() calls cudaFree() on all frame buffers in frames_pool_ and ready_frames_, then delete pimpl_. However, frames that have been emitted downstream and not yet released are not in either queue. When downstream operators eventually release those entities, the release_func lambda dereferences a destroyed this pointer and calls push_back on a deleted frames_pool_. The CUDA memory backing in-flight frames is also freed while downstream is still using it.
| pyyaml \ | ||
| scapy | ||
| scapy \ | ||
| holoscan==${HOLOSCAN_DEB_REMOTE_VERSION} \ |
There was a problem hiding this comment.
HOLOSCAN_DEB_REMOTE_VERSION ARG is not in scope in the base-deps stage. The variable is declared globally and re-declared in hsdk-remote, but never re-declared after the FROM hsdk-installed AS base-deps statement. In Docker multi-stage builds, ARGs must be re-declared in each stage. This causes ${HOLOSCAN_DEB_REMOTE_VERSION} to expand to an empty string, making pip install holoscan== fail at build time. Add ARG HOLOSCAN_DEB_REMOTE_VERSION after the FROM for this stage, consistent with how hsdk-remote handles it.
| uint64_t expected_seq = prev_seq + 1; | ||
|
|
||
| if (curr_seq != expected_seq) { | ||
| ANM_STRATEGY_LOG("RTP sequence discontinuity: expected {}, got {} (prev was {})", |
There was a problem hiding this comment.
RTP sequence number wraparound permanently breaks copy strategy detection. Sequences are stored as uint64_t, so when the 32-bit RTP sequence wraps from 0xFFFFFFFF to 0, expected_seq becomes 0x100000000 which never matches the actual next sequence 0. This triggers a perpetual reset() loop. After ~2 hours of 4K/60fps streaming, current_copy_strategy_ becomes permanently null and all subsequent frames are silently discarded. Fix: use modular 32-bit arithmetic, e.g. uint32_t expected_seq = (uint32_t)(prev_seq + 1);
| size_t pointer_diff = curr_ptr - prev_ptr; | ||
|
|
||
| if (i == 1) { | ||
| actual_stride = pointer_diff; |
There was a problem hiding this comment.
Pointer subtraction stored as size_t produces corrupt stride on ring buffer wrap. Pointer subtraction yields signed ptrdiff_t. When curr_ptr < prev_ptr due to ring buffer cycling, the negative value wraps to ~2^64 when stored in size_t, corrupting actual_stride_. This value is subsequently passed as src_pitch to cudaMemcpy2D, causing a CUDA error or silent memory corruption. The same issue exists in is_stride_maintained() with size_t actual_diff = payload - last_packet_ptr_. Fix: use ptrdiff_t and check for negative values.
| visualizer = HolovizOp( | ||
| fragment=self, | ||
| name="visualizer", | ||
| allocator=allocator, |
There was a problem hiding this comment.
Python media player ignores visualize and write_to_file config flags. HolovizOp is created unconditionally here, ignoring media_player_config.visualize and media_player_config.write_to_file. The C++ version correctly checks these flags and conditionally creates either HolovizOp or FramesWriterOp. Setting visualize: false with write_to_file: true in the YAML has no effect in the Python version.
This PR adds multimedia network operators that ease development of media streaming applications with Holoscan Holohub. Two sample applications (sender and player) are provided to demonstrate how to utilize these operators effectively.
New Components Added
🔧 Core Operators
📺 Sample Applications
VideoStreamReplayer → AdvNetworkMediaTxOp → Network InterfaceNetwork Interface → AdvNetworkMediaRxOp → HolovizOp/FramesWriterDeveloper Benefits
Dependencies
This PR builds upon the foundational Rivermax infrastructure provided by PR #813, which includes Rivermax Dev Kit migration and Tx support implementation.
Target Use Cases
Broadcast and media production environments requiring precise timing and high throughput over IP networks. The sample applications serve as reference implementations for developers building similar streaming solutions.
Summary by CodeRabbit
New Features
Enhancements
Documentation